AD  AO  67166 


THE  USE  OF  MODELS  IN  IMAGE  ANALYSIS 




GEORGE  C.  STOCKMAN 
STEVEN H.  KOPSTEIN 

L.N.K.  CORPORATION 

302  NOTLEY  COURT 

SILVER SPRING, MARYLAND 20904


JANUARY  1979 


Approved for public release; distribution unlimited.


AEROSPACE  MEDICAL  RESEARCH  LABORATORY 
AEROSPACE  MEDICAL  DIVISION 
AIR  FORCE  SYSTEMS  COMMAND 
WRIGHT-PATTERSON  AIR  FORCE  BASE,  OHIO  45433 




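NOTICES

When US Government drawings, specifications, or other data are used for any purpose other than
a definitely related Government procurement operation, the Government thereby incurs no
responsibility nor any obligation whatsoever, and the fact that the Government may have
formulated, furnished, or in any way supplied the said drawings, specifications, or other data,
is not to be regarded by implication or otherwise, as in any manner licensing the holder or any
other person or corporation, or conveying any rights or permission to manufacture, use, or sell
any patented invention that may in any way be related thereto.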


Please do not request copies of this report from Aerospace Medical Research Laboratory.
Additional copies may be purchased from:

National  Technical  Information  Service 
5285  Port  Royal  Road 
Springfield,  Virginia  22161 

Federal  Government  agencies  and  their  contractors  registered  with  Defense  Documentation  Center  should  direct 
requests for copies of this report to:

Defense  Documentation  Center 
Cameron  Station 
Alexandria,  Virginia  22314 


TECHNICAL  REVIEW  AND  APPROVAL 

AMRL-TR-78-117 


This  report  has  been  reviewed  by  the  Information  Office  (01)  and  is  releasable  to  the  National  Technical  Information 
Service  (NTIS).  At  NTIS,  it  will  be  available  to  the  general  public,  including  foreign  nations. 

This  technical  report  has  been  reviewed  and  is  approved  for  publication. 

FOR  THE  COMMANDER 


HENNING E. VON GIERKE
Director

Biodynamics and Bioengineering Division
Aerospace  Medical  Research  Laboratory 




SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

1.  REPORT NUMBER: AMRL-TR-78-117
4.  TITLE: THE USE OF MODELS IN IMAGE ANALYSIS
5.  TYPE OF REPORT AND PERIOD COVERED: Technical Report, 1 Nov 76 - 31 Aug 78
7.  AUTHORS: George C. Stockman, Steven H. Kopstein
8.  CONTRACT OR GRANT NUMBER: F33615-76-C-[illegible]
9.  PERFORMING ORGANIZATION NAME AND ADDRESS: L.N.K. CORPORATION, 302 Notley Court
10. PROGRAM ELEMENT, PROJECT, TASK AREA AND WORK UNIT NUMBERS: [illegible]
11. CONTROLLING OFFICE NAME AND ADDRESS: Aerospace Medical Research Laboratory, Aerospace
    Medical Division, Air Force Systems Command
15. SECURITY CLASS. (of this report): Unclassified
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE: N/A
16. DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution unlimited.
17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):
18. SUPPLEMENTARY NOTES:
19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
    feature extraction, image processing, image understanding, interactive
    imagery screening, models, object detection, registration.

DD FORM 1473 (1 JAN 73): EDITION OF 1 NOV 65 IS OBSOLETE

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)


Several of the automatic components of an interactive imagery screening and
target detection system are studied. Components include 1) a module for detection
of primitive image features without model hypothesis, 2) a module for generation
of hypotheses given that a particular model for the data has been evoked
by primitive feature content, and 3) a module for testing the model generated
hypotheses against the data. Only shape features extracted from boundary segments
are considered in this work. Features used are straight, circular, and
parabolic edge segments and points of intersection or points of high curvature
on edge segments. Feature extraction algorithms are discussed. Two different
model types are considered: problem reduction representations (PRR) which correspond
to grammar models and iconic models which correspond to cartographic data
bases. Once primitive features from an image are aligned with a potential model
the model generates hypotheses about the data which can be either verified or
denied. Testing of hypotheses is done via curve fitting or template matching.

A new  approach  to  registration  is  presented  and  documented  via  experimental 
results.  The  registration  approach  enables  the  use  of  cartographic  data  bases 
as  models  for  use  in  image  analysis  and  object  detection. 
Summary


A study of the use of models in image analysis is reported. Models are
structured a priori information which can be used to interpret data in a manner
consistent with real-world knowledge. Potential models are selected by primitive
feature extraction. Primitive features studied in this research were all
derived from boundary curve segments, i.e. edges, of the image. Two types of
models were considered for encoding real-world structural knowledge. One model
studied was Problem Reduction Representation (PRR) or equivalently the Context
Free Grammar (CFG) which generically specifies structure. The second type of
model considered was the Geographic Data Base (GDB) which iconically encodes
particular shape features to be seen in aerial imagery.

Whatever model is used, primitive features are required to align a hypothetical
model with raw image data. Further analysis is then made by verification
of structural hypotheses. Verification is treated here as either template matching
or curve fitting under constraints.

The  study  attempted  to  draw  conclusions  about  an  entire  image  screening  system 
by  studying  several  possible  parts.  Many  experiments  were  performed  and  a large 
amount  of  literature  reviewed.  As  a result  of  the  study,  the  following  conclusions 
were  reached. 

. Useful interpretation of imagery requires that instances of sensed data be
integrated with large amounts of stored real-world knowledge.

. Representation of real-world knowledge, particularly for use by a computer,
is a difficult task with much current research activity. Generic models
such as PRR or CFG are difficult to use in practice but particular iconic
shape models appear to have practical potential.

. Current automatic primitive feature detection techniques can support complex
analysis when features are registered to an iconic model.

. Evaluating and combining confidence values for verifying hypotheses about
image structure is difficult in both theory and practice and requires further
work.

. The most promising future direction for reconnaissance image analysis appears
to be toward map-guided image analysis.
1. Introduction


Efficient automatic or semiautomatic analysis of aerial imagery is a
problem of great practical interest to the Air Force. The complexity of the
image interpretation tasks of target identification and target location has
thus far been too great for automatic processing. However, success has been
achieved in the recognition of simple or stereotyped objects [Ashbaugh 1973]
and in the automatic verification of certain mapped features in imagery [Barrow
1977]. The computer can outperform a human in some detection tasks and has the
virtue of being indefatigable in its efforts. It is natural then to attempt
a man-machine synthesis whereby image analysis would be achieved with each
component performing the tasks which it does best.

Investigation of interactive screening of reconnaissance imagery was begun by
L.N.K. CORPORATION in October of 1974 under Contract F33615-75-C-5056. Preliminary
results were reported in Stockman and Kanal [1976] and recommendations for future
work were made. This report summarizes the results achieved during a follow-up
investigation of certain subproblems broken down in the initial study. Figure 1
shows the possible flow of information and control in an interactive imagery
screening system. Stations 1 and 2 are used for imagery for which no previous
computer-stored analysis exists. Primitive detectors are applied at station 1 to
detect features common to targets: straight edge activity, corners, parallel edges,
or symmetrical edge activity. If any such features are detected the imagery is
examined further at station 2 where object models are tested against the data
automatically. If any targets are detected the human analyst (station 4) is alerted
for further interpretation of the imagery and for compilation of a symbolic
image for the data base to be used in repeat coverage at some later date.

Whenever imagery is input to the system and symbolic coverage exists in the
data base, an automatic process (station 3) attempts to match, or register,
the new raw imagery to the symbolic imagery. Apparently, much matching can be
done automatically [Barrow 1977; Stockman 1978] at this stage. Significant
discrepancies detected during the matching of data to archive must be brought to
the attention of the human interpreter for further analysis.

For implementation of a system as described, blocks 1, 2, and 3 are problematical
because they involve computer decision-making. This report examines possible
implementations for blocks 1, 2, and 3 of the interactive screening system.

Section 2 of this report deals with primitive detection; that is, with the automatic
recognition of primitive image features without benefit of context or higher
level knowledge. Curves, straight lines, corners, and points of high curvature
are discussed as important primitives. The primitives are useful not only in
block 1 as evidence of cultural activity but also in block 3 to automatically
establish a correspondence (registration) between the image and a map in the archive.
Section 3 discusses the use of grammar models for object recognition and section 4
considers object detection as registration of image edges to object model edges.

The automatic recognition of full objects is required for successful implementation
of block 2 while the registration technique developed in section 4 provides
the registration required in block 3 of Figure 1.
Conclusions are rendered in Section 6 of this report. Very briefly, it
can  be  said  here  that  all  practical  image  analysis  problems  require  the  input 
of  information  from  sources  other  than  the  input  imagery  itself.  Representation 
of  this  outside  knowledge  for  use  by  an  automatic  process  is  one  of  the  most 
interesting  and  difficult  problems  under  current  research.  Perhaps  the  most 
promising  current  alternative  is  to  use  positional  knowledge  as  encoded  in  present 
day  cartographic  data  bases.  By  using  such  symbolic  data  bases  and  a system  such 
as that in Figure 1, it should be possible to do analysis of repeat coverage much
more  rapidly  and  accurately  than  original  coverage.  In  this  manner,  a large  initial 
investment  in  human  analysis  can  provide  machine  useable  knowledge  for  future  payoffs 
in  automatic  analysis. 


2.  Primitive  detection 




This  section  discusses  the  extraction  of  primitive  features  from  grey 
scale  imagery.  All  features  used  here  are  edge  dependent  features  in  the 
sense  that  their  detection  depends  on  detection  of  the  boundary  between  two 
regions  of  contrasting  grey  scale.  Edge  primitives  or  edge  elements  are  not 
single  contrast  points  but  rather  a minimum  collection  of  them  defining  a 
connected  and  continuous  segment  of  a boundary.  It  is  a common  view  [Marr  1975] 
that  such  edge  elements  form  the  basis  on  which  higher  level  human  recognition 
processes  operate. 

Image  points  of  high  contrast  can  be  automatically  identified  by  a number 
of  mathematical  techniques  of  varying  complexity.  A survey  of  edge  detection  is 
given  in  [Davis  1975]  and  experiments  are  reported  in  [Bullock  1974]  and  [Rosenfeld 
1971].  After  reviewing  the  literature  on  edge  detection  and  experimenting  with 
several  techniques  the  author  has  arrived  at  the  following  conclusions. 

. Due  to  lack  of  contrast,  edge  operators  cannot  be  expected 
to  extract  all  the  edge  points  from  any  real  world  scene. 

. Due  to  image  noise  there  will  be  automatically  extracted  edge 
points  in  places  where  a human  will  not  perceive  them. 

. The ideal edge content of any real world scene cannot generally
be extracted without considerable semantic information from sources
outside of the imagery.

. If a small part of the outside semantic information used
by a human in image interpretation were available to an
automatic process, most of the current edge detection schemes
would be entirely adequate to support complete image analysis.

The  fourth  point  made  asserts  an  optimistic  view  in  spite  of  the  three  initial 
negative  remarks.  It  seems  clear  that  research  on  primitive  edge  detection  should 
be  curtailed  while  work  on  semantic  interpretation  and  use  of  knowledge  should  be 
pursued. It is assumed in this report that current simple edge detection operators
are sufficient for capturing an essential representation of a scene. An essential
representation is one that supports semantic interpretation and hypothesis formation
which can in turn be used for driving more focused edge detection operations. The
rest of this section discusses rather simple and economical methods for extracting
partial edge content from imagery to be used for higher level interpretation. The
fact that these simple operators fail in many cases does not preclude correct image
analysis  since  higher-level  operators  will  come  into  play. 
2.1 Extraction of smooth edge elements


Very general boundary curve detection consists of two simple steps:
first, high contrast (edge) points are identified in the image and secondly sets
of  these  points  are  organized  into  continuous  curve  segments.  Due  to  noise  and 
low  contrast  it  is  unreasonable  to  expect  unbroken  boundary  curves  as  a result 
of  such  general  low-level  processing.  Higher  level  processing  using  geometric 
or  topological  constraints  can  connect  curve  segments  into  complete  boundary 
structures.  Partial  semantic  interpretation  of  the  image  may  be  necessary  in 
order  to  reliably  connect  curve  segments.  Features  of  existing  curve  segments 
such as length, curvature, or degree of match to a stored prototype may be used in
the interpretation and/or connection decisions. This section addresses only very
general low-level curve segment extraction which is appropriate for arbitrary problem
domains. Enhancement and interpretation of the curve segment set via specific
semantics  is  the  topic  of  future  work. 

Curve segments representing the image data can be extracted in a 3-step process.
First of all, all image points are examined and a set of high contrast points
is extracted. A Roberts' type gradient operator is applied to each point and a
gradient magnitude and direction are extracted. L.N.K. has obtained good results
by keeping only that 5% of image points which have the highest gradient magnitude
(contrast). The second process examines a small neighborhood around each edge point
extracted in step 1 and finds the best continuing edge point in the forward and
backward direction. Links are set pointing to the best continuation points
if such points exist in the high contrast set. These links are established
independently (theoretically in parallel) for each high contrast point. The
third step extracts curve segments as chains of high contrast points mutually
linked together in step 2 of the procedure.


2.1.1  Step  1:  extraction  of  high  contrast  edge  points 


L.N.K. has developed a gradient operator based on masks which allows gradient
direction to be optionally computed at resolution of 1/8, 1/16, or 1/32 of the
circle (i.e. 45°, 22 1/2°, or 11 1/4°). Gradient magnitudes are histogrammed and a
fixed percentage of the highest contrast points are selected. Recent work has been
done with 2%, 5%, or 10% of the image points. High contrast points are saved in
array  storage  outside  of  the  image  storage.  Appendix  A documents  the  simple  edge 
operator  based  on  masks. 
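
Step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a plain Roberts cross difference rather than the L.N.K. mask operator of Appendix A, a quantile in place of an explicit histogram, and the names high_contrast_points, keep_fraction and n_directions are hypothetical.

```python
import numpy as np

def high_contrast_points(image, keep_fraction=0.05, n_directions=8):
    """Return (rows, cols, direction_index) of the strongest-gradient pixels.

    Assumption: a Roberts cross operator stands in for the mask operator
    of the report; angles are measured in the operator's diagonal frame.
    """
    img = np.asarray(image, dtype=float)
    d1 = img[:-1, :-1] - img[1:, 1:]      # difference along one diagonal
    d2 = img[:-1, 1:] - img[1:, :-1]      # difference along the other diagonal
    magnitude = np.hypot(d1, d2)
    angle = np.arctan2(d2, d1)
    # Quantize direction to 1/8, 1/16 or 1/32 of the circle.
    sector = 2.0 * np.pi / n_directions
    direction = np.round(angle / sector).astype(int) % n_directions
    # Keep a fixed fraction of the highest-magnitude points.
    threshold = np.quantile(magnitude, 1.0 - keep_fraction)
    rows, cols = np.nonzero((magnitude >= threshold) & (magnitude > 0))
    return rows, cols, direction[rows, cols]
```

For a real image the call high_contrast_points(image, 0.05, 32) would correspond to the 5% selection at 11 1/4° resolution mentioned above.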


2.1.2 Step 2: finding continuing points by local processing


The neighborhood of each high contrast point is independently examined by a
spiraling search around the given point (see Figure 2). Neighbors closest to
the point are considered first and only neighbors within a fixed radius r are
examined. Links are established to the first high contrast point with acceptable
gradient direction continuing a curve in either the forward or backward direction.
The amount of curvature to be tolerated in the curve is expressed as a tolerance
on the agreement of gradient directions. The forward direction of traversal of
a curve is taken to be that direction of traversal placing the darker region to
the right of the curve. This processing induces two relations on the set of high
contrast edge points $E = \{e_1, e_2, \ldots, e_n\}$:
$F = \{(e_i, e_j) : e_i \text{ forward links to } e_j\}$
and $B = \{(e_i, e_j) : e_i \text{ backward links to } e_j\}$. Note that $(e_i, e_j) \in F$ does not mean
that $(e_j, e_i) \in B$. This symmetry will probably exist if edge points $e_i$ and $e_j$ are
indeed consecutive points on the same boundary segment. However, at locations of
curve junctions or poor contrast, the edge point relationships are expected to be
broken. Linking of points is done in image array storage.
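
The linking step can be sketched as follows. This is a minimal illustration, not the L.N.K. implementation: neighbors are ordered by distance instead of by the exact sequence of Figure 2, the gradient is assumed to point from dark to light in a y-up coordinate frame, and the names spiral_offsets, link_points and tolerance are hypothetical.

```python
import math

def spiral_offsets(radius):
    """Neighbor offsets within `radius`, nearest first (assumed ordering)."""
    offs = [(dx, dy) for dx in range(-radius, radius + 1)
            for dy in range(-radius, radius + 1)
            if (dx, dy) != (0, 0) and dx * dx + dy * dy <= radius * radius]
    return sorted(offs, key=lambda o: (o[0] ** 2 + o[1] ** 2,
                                       math.atan2(o[1], o[0])))

def link_points(points, radius=4, tolerance=1, n_directions=8):
    """points: {(x, y): gradient direction index}. Returns (F, B) as dicts."""
    forward, backward = {}, {}
    sector = 2.0 * math.pi / n_directions
    offsets = spiral_offsets(radius)
    for (x, y), d in points.items():
        # Assumed convention: travel direction is the gradient direction
        # rotated by -90 degrees, so the darker region lies to the right.
        travel = d * sector - math.pi / 2.0
        tx, ty = math.cos(travel), math.sin(travel)
        for dx, dy in offsets:
            q = (x + dx, y + dy)
            if q not in points:
                continue
            diff = abs(points[q] - d) % n_directions
            if min(diff, n_directions - diff) > tolerance:
                continue                      # gradient directions disagree
            side = dx * tx + dy * ty          # > 0 ahead, < 0 behind
            if side > 0 and (x, y) not in forward:
                forward[(x, y)] = q
            elif side < 0 and (x, y) not in backward:
                backward[(x, y)] = q
    return forward, backward
```

Each point is processed independently of the others, which mirrors the remark that the links could in principle be established in parallel.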


2.1.3  Step  3:  collection  of  continuous  chains  of  edge  points 


If $(e_i, e_j) \in F$ and $(e_j, e_i) \in B$ then $e_i$ and $e_j$ are consecutive points on a
curve segment. All the high contrast points can now be placed into equivalence
classes (representing curve segments) as follows. Define the relationship $R$ such
that $(e_i, e_j) \in R$ if and only if there is a chain of forward (backward) links
(possibly a null chain) from point $e_i$ to point $e_j$ and a chain of backward (forward)
links from point $e_j$ to point $e_i$. $R$ is reflexive, symmetric and transitive. Each
equivalence class represents a separate curve segment. Curve segments can then be
extracted by considering each point of the high contrast set (in any order) and
tracking all related points when a beginning curve point is encountered. A beginning
curve point is a point $e_i$ such that if $(e_i, e_j) \in B$ then $(e_j, e_i) \notin F$. Tracking of
curve segments is done in image array storage. The image is raster-scanned for
beginning points. When a beginning point is found the chain is tracked until
broken. Then the raster scan resumes at its former place. Because each curve
segment has but one beginning point there is no duplication. Chains smaller
than some fixed number of points are suppressed thus removing many noise edges.
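
A sketch of the chain collection, assuming the forward and backward relations are held as dictionaries mapping each point to its linked point. The function name extract_chains and the parameter min_length are hypothetical, and a sorted traversal stands in for the raster scan.

```python
def extract_chains(forward, backward, min_length=3):
    """Collect chains of mutually linked points from the F and B relations."""
    def mutual_next(p):
        q = forward.get(p)
        return q if q is not None and backward.get(q) == p else None

    def mutual_prev(p):
        q = backward.get(p)
        return q if q is not None and forward.get(q) == p else None

    points = (set(forward) | set(backward)
              | set(forward.values()) | set(backward.values()))
    chains = []
    for p in sorted(points):                  # stand-in for the raster scan
        if mutual_prev(p) is not None:
            continue                          # not a beginning point
        chain, seen = [p], {p}
        q = mutual_next(p)
        while q is not None and q not in seen:
            chain.append(q)
            seen.add(q)
            q = mutual_next(q)
        if len(chain) >= min_length:          # suppress short noise chains
            chains.append(chain)
    return chains
```

Note that in this sketch a closed loop in which every link is mutual has no beginning point and is therefore not reported; the report does not say how such loops were handled.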

2.1.4  Examples  of  smooth  curve  segment  extraction 

Figure 3 shows a light airplane on a darker airfield. The curve extraction
procedure as applied to a window containing the right wing tip is illustrated
in Figures 4, 5, and 6. The high contrast points near the wing tip
and their gradient directions are shown in Figure 4. Figure 5 shows a plot
of all forward and backward links created by the spiraling neighborhood searches
of step 2 of the process. Notice that certain points are of degree 3 meaning that
they are at the junctions of multiple edge activity. These points must be at the
terminus of an extracted curve segment because they cannot relate symmetrically
to 3 neighbors. Large sets, or chains, of symmetrically related points are shown
in Figure 6. A large portion of the wing boundary is successfully extracted
along with two edges of the "USAF" identification interior to the wing and two
edges of a dark streak on the airfield below the plane.

Figure 7 shows a photo containing curved roads in rough terrain (lower right
corner). The high contrast points from a region where two roads intersect are
shown in Figure 8. The point linking relations are shown in Figure 9 and the
five curve segments extracted are shown in Figure 10. The straight curve segment
oriented toward 225° is caused by a shadow which cuts across one of the intersecting
roads. The other curves are from the road edges and a midstrip structure at
the intersection.

2.1.5  Discussion  of  smooth  curve  extraction 

The curve extraction algorithm has been used to support a registration procedure
which matches curve segments of an image with those of a map or model. Segments
with points of high curvature were selected and measured for curvature and
typed  as  either  concave  or  convex.  These  features  of  the  extracted  curves  allowed 
for  selective  matching  to  curves  in  the  map  or  model.  Many  of  the  wing  tips,  tail 
tips,  and  nose  tips  of  a set  of  airplanes  were  extracted  and  used  for  registration 
in  this  manner.  Some  airplane  parts  were  missed  due  to  a fracturing  of  the  curves. 
There  were  similar  problems  with  the  terrain  imagery  due  to  shadows  or  low  contrast. 


Higher  level  problem  specific  knowledge  must  be  employed  to  join  general  curve 
segments  to  form  the  boundary  of  recognizable  objects.  The  interpretation  of  the 
curves depends on the recognition of the objects and vice versa. As a simple example,
if a set of curve segments maps onto parts of an airplane model under the same RS&T*
transformation,  an  appropriate  linking  of  curve  segments  for  forming  the  continuous 
boundary  is  immediately  suggested.  L.N.K.  has  been  successful  at  verifying  faint 
curve  segments  under  model  direction  and  has  thus  been  able  to  get  complete  boundary 
curves  for  modeled  objects  even  when  high  contrast  points  form  only  a partial  object 
boundary.  More  discussion  on  this  topic  follows  in  Section  4. 


*Rotation,  Scaling  and  Translation 
      INTEGER DX,DY,COMPMT
      COMMON /DELTAS/ DX(68),DY(68),COMPMT(68)
      DATA DX /1,1,0,-1,-1,-1,0,1,
     +  2,2,2,1,0,-1,-2,-2,-2,-1,0,1,
     +  2,3,3,3,2,1,0,-1,-2,-3,-3,-3,-2,-1,0,1,
     +  2,3,3,2,-2,-3,-3,-2,-1,0,1,4,4,4,1,0,
     +  -1,-4,-4,-4,-4,-3,-2,2,3,4,4,3,2,-2,-3,-4/
C
C             59 45 46 47 60
C          58 44 34 35 36 37 61
C       57 43 33 18 19 20 21 38 62
C       56 32 17  6  7  8  9 22 48
C       55 31 16  5  *  1 10 23 49
C       54 30 15  4  3  2 11 24 50
C       68 42 29 14 13 12 25 39 63
C          67 41 28 27 26 40 64
C             66 53 52 51 65
C
      DATA DY /0,-1,-1,-1,0,1,1,1,
     +  1,0,-1,-2,-2,-2,-1,0,1,2,2,2,
     +  2,1,0,-1,-2,-3,-3,-3,-2,-1,0,1,2,3,3,3,
     +  3,2,-2,-3,-3,-2,2,3,4,4,4,1,0,-1,-4,-4,
     +  -4,-1,0,1,2,3,4,4,3,2,-2,-3,-4,-4,-3,-2/
      DATA COMPMT /5,6,7,8,1,2,3,4,
     +  15,16,17,18,19,20,9,10,11,12,13,14,
     +  29,30,31,32,33,34,35,36,21,22,23,24,25,26,27,28,
     +  41,42,43,44,37,38,39,40,51,52,53,54,55,56,45,46,
     +  47,48,49,50,63,64,65,66,67,68,57,58,59,60,61,62/
C     COMPMT DEFINES 180 DEGREE ROTATIONS. COMPMT(0) IS UNDEFINED


Figure  2.  Definition  of  spiral  search  sequence 
through  the  neighbors  of  a pixel*. 
Figure 5. Plot of all forward and backward linking relationships
among high contrast points of Figure 4 (curve detection step 2).
2.2 Extraction of straight edge elements


The  Hough  transformation  is  an  efficient  device  for  detecting  if  a set 
of  high  contrast  points  are  organized  along  a mathematical  curve  [Duda  1972]. 

The simplest mathematical curve and the most important one for detection of man-made
structures is the straight line. Only two parameters are required for specification
of a given line: in polar form the parameters are the direction
of the normal to the line ($\theta$) and the distance from the origin to the line ($r$).
If the possible line directions are discretized to $T$ values and the possible
distances from the origin are discretized to $R$ values then Hough detection is
logically equivalent to a matching of $T \cdot R$ templates to the high gradient points
[Stockman 1977].
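
The template view of the Hough transform can be illustrated with a small accumulator. This is a sketch under assumptions, not the L.N.K. detector: every point votes for every direction, distances are binned on evenly spaced centers from -r_max to +r_max, and the names hough_lines, r_max and threshold are hypothetical. The defaults echo the coarse settings quoted in Section 2.2.1.

```python
import numpy as np

def hough_lines(points, T=32, R=17, r_max=16.0, threshold=30):
    """Return (theta, r, count) for every (theta, r) cell with enough votes.

    points are (x, y) pairs relative to the window center; a line is the
    set of points with x*cos(theta) + y*sin(theta) = r.
    """
    pts = np.asarray(list(points), dtype=float).reshape(-1, 2)
    thetas = np.arange(T) * (2.0 * np.pi / T)
    r_centers = np.linspace(-r_max, r_max, R)
    step = r_centers[1] - r_centers[0]
    acc = np.zeros((T, R), dtype=int)
    for t, theta in enumerate(thetas):
        r = pts[:, 0] * np.cos(theta) + pts[:, 1] * np.sin(theta)
        idx = np.round((r + r_max) / step).astype(int)
        idx = idx[(idx >= 0) & (idx < R)]     # drop votes outside the window
        acc[t] = np.bincount(idx, minlength=R)
    return [(float(thetas[t]), float(r_centers[k]), int(acc[t, k]))
            for t, k in np.argwhere(acc >= threshold)]
```

Each accumulator cell plays the role of one of the $T \cdot R$ templates: its count is the number of high contrast points falling inside that template.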

2.2.1  Enhancements  to  the  general  Hough  transform 

Three special enhancements were made by L.N.K. in its use of the Hough
transform. First of all, only a small percentage of the high gradient points
were passed to the Hough detector. This was achieved by histogramming the
gradient image and setting a selection threshold such that 2%, 5%, or 10% of
the image points were passed. An exception to this policy occurred if the threshold
were lower than an estimate of the standard deviation of noise gradients in
uniform regions. In that case the threshold was set to the noise level gotten
from interactive training on uniform regions of imagery. By using only a small
percentage of the strongest edge points, only the strongest edges would be detected
while weak edges or noise edges would be suppressed. This had the effect of
making the false alarm rate for detections almost 0 while the false dismissal rate
A second enhancement made by L.N.K. to the general Hough line detector
was the provision for refining the resolution of the parameters of detected
lines. Coarse detection was made with $T = 32$, i.e. 32 line directions were
possible in 11 1/4° increments, and $R = 17$ since only 2 pixel wide lines within
±16 pixels of the center of a 60 x 60 window were considered. Each detection
made at $(\theta, r)$ in the coarse resolution process was refined as follows. A new
set of $T \cdot R = 7 \cdot 5 = 35$ templates were established with 2° directional resolution
and line width of 1 pixel. The parameter space tested was
$\{\theta-6, \theta-4, \theta-2, \theta, \theta+2, \theta+4, \theta+6\} \times \{r-2, r-1, r, r+1, r+2\}$
where $\theta$ and $r$ were the parameters gotten from
coarse detection. Directional resolution finer than 2° would have required
windows larger than 60 x 60 pixels.
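
The refinement parameter space is small enough to enumerate directly. The following sketch simply lists the 35 candidate templates around a coarse detection; the function name refinement_grid and its keyword arguments are hypothetical.

```python
def refinement_grid(theta_deg, r, d_theta=2, n_theta=7, n_r=5):
    """Candidate (theta, r) pairs around a coarse detection.

    theta steps by d_theta degrees and r by 1 pixel, giving
    n_theta * n_r candidates (7 * 5 = 35 with the defaults).
    """
    half_t, half_r = n_theta // 2, n_r // 2
    thetas = [(theta_deg + d_theta * i) % 360
              for i in range(-half_t, half_t + 1)]
    rs = [r + j for j in range(-half_r, half_r + 1)]
    return [(t, rr) for t in thetas for rr in rs]
```

Each candidate would then be scored as a 1 pixel wide template against the high contrast points, and the best response kept as the refined line.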

The third enhancement to Hough line detection was rendered by checking for
compatibility between the gradient direction at the high contrast point and the
gradient direction of the candidate template for a line. Typically a high contrast
point was considered to belong to at most 5 of the possible $T \cdot R$ lines. The
actual number of possibilities was dependent on the resolution used in the gradient
extraction. Thus, the use of gradient information further increased the speed of
execution and at the same time sharpened the output of the detector.
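
The figure of "at most 5" lines can be reproduced with a small calculation, assuming that a template is compatible when its normal direction lies within half a gradient quantization step of the measured gradient direction. That compatibility rule, and the names compatible_templates and grad_resolution_deg, are assumptions made for illustration.

```python
import math

def compatible_templates(x, y, grad_deg, grad_resolution_deg=45.0, T=32):
    """List (direction index, r) of templates a point may vote for.

    Assumption: a template direction is compatible when it differs from
    the gradient direction by at most half the gradient resolution.
    """
    step = 360.0 / T
    half = grad_resolution_deg / 2.0
    out = []
    for t in range(T):
        theta = t * step
        diff = abs((theta - grad_deg + 180.0) % 360.0 - 180.0)
        if diff <= half:
            rad = math.radians(theta)
            # For a fixed theta the point determines exactly one distance r.
            out.append((t, x * math.cos(rad) + y * math.sin(rad)))
    return out
```

With gradient directions resolved to 45° and template directions to 11 1/4°, five directions pass the test, so each point votes for 5 templates instead of 32.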

2.2.2 Examples of Hough detection

Once the previously described procedure was implemented and its parameters
were tuned to the imagery at hand, a great deal of testing was performed with no
further changes made to the algorithm. In several cases extracted edges were used
for registration or object detection experiments as described in Section 4 of this
report.

Figure 11 shows an airfield image and two subimages taken from it. Coarse
Hough detections made on the subimages are shown in the lower right. Each straight
edge element shown is at most $60\sqrt{2}$ pixels long since 60 x 60 pixel windows were
used to cover the images. Many good edge elements have not been detected as a
result of the 5% point selection process employed. This effect is particularly
prominent at the intersection of the three walks where widespread edge activity
causes the highest contrast point set to be scattered and incapable of causing a
strong response in any single template. In both images some edge elements overshoot
their true length. This is because the responding templates are plotted
rather than just the points inside them. In general further clean up is needed
to delimit the true size of detected edge elements. In order to detect all nearly
straight edges of length 30 pixels or more in the presence of noise, templates 2 pixels
wide were used and the detection threshold was set to 30 a priori. Since templates
contained roughly 120 pixels, roughly 1/4 of their points had to respond in order for
the template to respond. Notice in the lower right of Figure 11 that high contrast
points on the short side of one building triggered two templates in two different
60 x 60 windows.
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Figure  12 


shows  a poor  image  of  an  airfield  and  the  resulting  coarse 
Hough  detections.  The  directional  resolution  of  the  edges  is  unsatisfactory. 
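
The coarse template detection described above (templates 2 pixels wide, a
detection threshold of 30 points) can be illustrated with a small sketch. This is
not the report's implementation: the function name, the binning of points into
parallel strips, and the 10 degree spacing of template directions are assumptions
made for illustration, and the input is assumed to be the high contrast points
already selected inside one window.

```python
import math

def detect_straight_edges(points, width=2.0, threshold=30, angle_step=10):
    """Return (theta_degrees, rho, votes) for each strip template that responds.

    points     -- (x, y) high contrast points already selected in one window
    width      -- template (strip) width in pixels
    threshold  -- minimum number of points inside a strip for it to respond
    angle_step -- spacing of template directions in degrees (assumed value)

    theta is the direction of the strip normal; rho is the distance of the
    strip centre line from the window origin along that normal.
    """
    detections = []
    for theta_deg in range(0, 180, angle_step):
        theta = math.radians(theta_deg)
        c, s = math.cos(theta), math.sin(theta)
        votes = {}
        for x, y in points:
            rho = x * c + y * s                  # projection onto the normal
            strip = math.floor(rho / width)      # index of the strip hit
            votes[strip] = votes.get(strip, 0) + 1
        for strip, count in sorted(votes.items()):
            if count >= threshold:
                detections.append((theta_deg, (strip + 0.5) * width, count))
    return detections

# A horizontal edge crossing a 60 x 60 window at y = 21.
window_points = [(x, 21) for x in range(60)]
print(detect_straight_edges(window_points))
```

Because a strip responds as a whole, a detection says only that enough points
fell inside the template. It does not say where along the template the edge
begins or ends, which is the overshoot effect noted for Figure 11.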

The refinement procedure discussed in Section 2.2.1 produced the results shown
in Figure 14. Excellent alignment of many of the refined edge elements allowed
a simple procedure to combine short edge elements into long ones as shown in
Figure 15. As will be shown in Section 4, the partial straight edge detection
shown in Figures 14 and 15 is sufficient to establish registration with a map
and thus unlock model information useful for focused reexamination of parts of the
image. The poor quality GAFB image was deliberately chosen to illustrate this
point. Before passing on, it is important to note that the light/dark relationships
along edge elements are indicated in Figures 14 and 15 while they are not evident
in Figures 11 and 12. The dark side of the edge will be at the right as the
edge is traversed in the direction of the arrow.
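
The report does not spell out the procedure used to combine short edge elements
into long ones. The following is only a sketch of one plausible approach, under
stated assumptions: segments are directed (so that the light/dark sense is
preserved), and the function name and all three tolerances are hypothetical.

```python
import math

def merge_collinear(segments, max_angle_deg=2.0, max_offset=1.5, max_gap=10.0):
    """Chain directed segments ((x1, y1), (x2, y2)) that are nearly collinear.

    All three tolerances are assumed values chosen for illustration.
    """
    def direction(seg):
        (x1, y1), (x2, y2) = seg
        return math.atan2(y2 - y1, x2 - x1)

    def can_join(a, b):
        # b may be appended to a if it points the same way and starts
        # just beyond the end of a, close to the line through a.
        da, db = direction(a), direction(b)
        diff = abs((db - da + math.pi) % (2 * math.pi) - math.pi)
        if math.degrees(diff) > max_angle_deg:
            return False
        ux, uy = math.cos(da), math.sin(da)
        dx, dy = b[0][0] - a[1][0], b[0][1] - a[1][1]
        along = dx * ux + dy * uy          # gap measured along a
        across = abs(dy * ux - dx * uy)    # offset perpendicular to a
        return 0.0 <= along <= max_gap and across <= max_offset

    segs = list(segments)
    merged_any = True
    while merged_any:
        merged_any = False
        for i in range(len(segs)):
            for j in range(len(segs)):
                if i != j and can_join(segs[i], segs[j]):
                    segs[i] = (segs[i][0], segs[j][1])   # extend a to the end of b
                    del segs[j]
                    merged_any = True
                    break
            if merged_any:
                break
    return segs

short_edges = [((0, 0), (10, 0)), ((12, 0), (20, 0)), ((0, 5), (0, 15))]
print(merge_collinear(short_edges))
```

Two segments with opposite directions are deliberately not merged, since they
would describe edges with opposite light/dark relationships.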

Figure 13 shows a large area with fine detail. Fine resolution Hough detections
for this image are shown in Figure 16. Very little high level structure is
evident in Figure 16 and perhaps more extraction effort should have been invested,
for example, in using more and smaller windows. However, the edge content shown
in Figure 16 proved to be sufficient to register the image of Figure 13 with a
map made from Figure 7. (The detected edge directions from Figure 13 were actually
reversed to get Figure 16 because Figure 13 is a negative rather than a positive
as is Figure 7.)

2.3 Points of special curvature


It has been known for a long time that points of high curvature and inflection
points on the boundary of an object contain most of the information used by
humans in recognizing the object. Such points also play an important role in
representation of an object in a compact form. A summary of a good deal of work
in this area can be found in [Pavlidis 1977].


Figure 17 shows a tracing of a few major features from a 1:250,000 map of
the Harrisburg, Pennsylvania region. These are features which should clearly be
evident in aerial photography and perhaps even in LANDSAT imagery. There are
several points whose uniqueness makes them vital to recognition or registration of
the region. Some of these points are intersection points, for instance, the juncture
of the Pa. Turnpike and Route 15. Perhaps a dozen good points of high curvature
exist. The crooked profile of Sherman Creek provides the greatest opportunity for
recognition or registration: 10 points of high curvature are available. The
Juniata River contains interesting bends but the Susquehanna does not. There is,
however, a sharp-cornered island down river from Harrisburg which has prominent
features. A thin reservoir with 3 sharp corners is evident in the top right quadrant.
As will be shown in Section 4, it is not necessary that all of the features of an
image be recognized before the image itself can be recognized. It is also not
necessary that continuous curves be extracted. For instance, segments of Sherman Creek
are likely to be disconnected as the creek ducks under thick foliage. Thus, only
some distinguishing features will be available in any given image of an area, but
there should always be enough for recognition.

Besides the shape information from the curve in the neighborhood of a high
curvature point, there might also be qualitative information, especially if the
imagery is multispectral. For instance, the points on the small streams are linear
water features inside a land/vegetation background. This water versus land quality
can be picked up automatically from the multispectral signal. The corners of Hill
Island would be defined by land/vegetation jutting out into open water. Boundary
points so defined by local shape and region features could easily be extracted
automatically with an acceptable degree of reliability and matched to a geographic
data base for recognition and registration purposes.

Some experimentation was carried out in the detection of high curvature boundary
points. The curve extraction procedure of Section 2.1 was applied to imagery to
provide segments of boundaries. The curvature of each boundary segment was computed by
a method similar to that in algorithm 7.1 of [Pavlidis 1977] and curve segments with
points of high curvature were identified. All other boundary segments were discarded
from this process. Figure 18 shows five "corners" identified in a window of the
AFB image in Figure 11. The window contains the second airplane from the bottom
and the wing of the first airplane. All three wing tips were extracted but the nose
and one tail tip of the complete plane were missed. Some structured noise was also
extracted. It should be clear that this evidence, along with other evidence such as
straight edge content, is useful for recognizing objects and determining their position
and the scale of the imagery. Further treatment follows later in this report.
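
As an illustration of how high curvature points might be picked out of an
extracted boundary segment, the sketch below uses a simple turning-angle measure
between a backward and a forward chord. This is a stand-in chosen for brevity
and is not algorithm 7.1 of [Pavlidis 1977]; the function names, the chord length
k, and the 60 degree threshold are all assumptions.

```python
import math

def turning_angles(curve, k=3):
    """Signed turn (degrees) between the backward and forward chords of length k."""
    angles = {}
    for i in range(k, len(curve) - k):
        (xb, yb), (x, y), (xf, yf) = curve[i - k], curve[i], curve[i + k]
        a_back = math.atan2(y - yb, x - xb)
        a_fwd = math.atan2(yf - y, xf - x)
        turn = math.degrees(a_fwd - a_back)
        angles[i] = (turn + 180.0) % 360.0 - 180.0   # wrap into [-180, 180)
    return angles

def high_curvature_points(curve, k=3, min_turn=60.0):
    """Return boundary points whose turn is large and locally maximal."""
    angles = turning_angles(curve, k)
    corners = []
    for i, turn in angles.items():
        if abs(turn) < min_turn:
            continue
        neighbours = (angles.get(i - 1, 0.0), angles.get(i + 1, 0.0))
        if all(abs(turn) >= abs(n) for n in neighbours):
            corners.append(curve[i])
    return corners

# An L-shaped boundary segment with a single right-angle corner at (9, 0).
boundary = [(x, 0) for x in range(10)] + [(9, y) for y in range(1, 10)]
print(high_curvature_points(boundary))
```

Segments for which the returned list is empty correspond to the boundary
segments that were discarded from further processing.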


2.4 Other primitive image features


The requirements on primitive features are (1) that they be simply defined
and (2) that they can be extracted automatically from imagery with acceptable
reliability. In addition to the primitives mentioned in Sections 2.1-2.3, two
others are presented. These primitives were, in fact, already briefly mentioned.

The intersections between two line or edge features can provide very good
features for recognition. Because edge detectors tend to be unstable
at intersections, some higher level (but still automatic and bottom-up) decision-making
is required to extend detected boundary segments and force the intersection.
It is even possible to create imaginary intersections as the surveyor does, for
instance, to create the intersection of the wall of a building (extended) and a
street. Intersections can create a local topology and geometry that provides
reliable matching to a stored representation. Work has already been done in this
area by [Zahn 1974] and [Dudani 1977]. Experiments with the use of simple
intersection features are discussed in Section 4.
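
Forcing an intersection by extending two detected segments reduces to
intersecting the infinite lines through them. A minimal sketch follows; the
function name and the parallelism tolerance are assumptions.

```python
def extended_intersection(seg1, seg2, eps=1e-9):
    """Intersect the infinite lines through two segments.

    Each segment is ((x1, y1), (x2, y2)). Returns the (x, y) point where
    the extended lines cross, or None if they are (nearly) parallel.
    """
    (x1, y1), (x2, y2) = seg1
    (x3, y3), (x4, y4) = seg2
    d1x, d1y = x2 - x1, y2 - y1
    d2x, d2y = x4 - x3, y4 - y3
    denom = d1x * d2y - d1y * d2x          # cross product of the directions
    if abs(denom) < eps:
        return None
    # Parameter t locates the crossing on line 1: P = P1 + t * d1.
    t = ((x3 - x1) * d2y - (y3 - y1) * d2x) / denom
    return (x1 + t * d1x, y1 + t * d1y)

# A street edge and the (extended) wall of a building that stops short of it.
street = ((0, 0), (4, 0))
wall = ((6, 2), (6, 5))
print(extended_intersection(street, wall))
```

The crossing point may lie outside both segments, which is exactly the
"imaginary intersection" case mentioned above.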

Boundary segments can be useful features even though the segment is not straight
or of high curvature. The boundary may be significant due to the types of regions
which it separates. This is particularly relevant if multispectral imagery is
available to make region extraction and interpretation a lower level process. The
fact that a boundary segment separates land and water regions does not give it
unique properties for matching to a reference, especially if the segment shape is
bland. However, the number of possible matches in a reference data base may be small,
and global consideration of several such ambiguous features could yield a unique
match between imagery and reference. A method for integrating ambiguous local
matching evidence to form a unique global match is given in Section 4. Relaxation
labeling is another technique for arriving at a global interpretation from ambiguous
local interpretations [Zucker 1976, Tenenbaum 1976].

Figure 13. IINBL test image: repeat coverage of naval base
in Figure 7. (Image is about 2000 x 2000 6-bit pixels.)

Figure 14. Coarse Hough detections from GAFB
refined to 2° directional resolution.

Figure 15. Hough detections from GAFB combined to form long edge elements.

3. Recognition of structures via grammar models


Grammar models, originally introduced to model the structure of language,
have maintained the interest of pattern recognition researchers for over a
decade. Context free grammars in particular allow tractable hierarchical modeling
of component structure. Initially conceived for linear strings, grammars
have been generalized to apply to 2-D as well, either by changing the grammar
model itself [Shaw 1970] or by analyzing only 1-D boundary curves in an image
[Ledley 1966].


3.1  Background  and  motivation  of  a grammatical  approach 


There have been many efforts in linguistic pattern recognition. The works
of Shaw [1970], Pavlidis [1977], and Fu [1974] are exemplary and contain much
discussion of the virtues of the linguistic approach. Many weaknesses of former
linguistic pattern recognition implementations stem from the fact that pattern
recognition researchers did little to tailor linguistic analysis methods to the
more demanding real data situation. First of all, most implementations commit
themselves to unique segmentations in "preprocessing" stages which do not utilize
available structural knowledge, and thus irrevocable decisions are made in locally
ambiguous contexts. Secondly, implementations have been one-directional. Analysis
is done in either a bottom-up (data-directed) or a top-down (model-directed) fashion
but not in both directions. Shaw's PDL analysis scheme was top-down with primitive
processing always done under model hypothesis. The approaches of Pavlidis and Fu
are characteristically bottom-up with early commitment to unique segmentations in
the absence of structural hypotheses.

by identifying reliably extracted primitive components of the model, extracting those
in preprocessing without structural constraint and then doing model-directed search
for remaining pattern structure. Thus analysis can proceed in either the bottom-up
or top-down direction. The dichotomy of terminal and nonterminal structures is retained
here: terminal structures are recognized only through primitive feature extraction
on real imagery while nonterminal structures are processed only in the higher
level syntactic/semantic model space. However, unlike most other approaches, recognition
of terminal and nonterminal structure is overlapped in time, with the data
processing and model processing providing each other with guiding feedback.
Section 3.2 outlines the theoretical development of such a non-directional analysis.
Section 3.3 describes a simple example of non-directional analysis in the detection
of rectangles in reconnaissance imagery. Primitive extraction is treated in more
detail in Section 3.4 and a final discussion of issues follows in Section 3.5.


3.2  Outline  of  a theory  for  non-directional  structural  pattern  recognition 


Due to space limitations complete definitions, proofs, and discussion of concepts
cannot be included here. Instead, certain basic background is assumed and
only a broad treatment is given. Excellent informal treatment of problem reduction
representations (PRR), also known as AND/OR graphs, and state space representations
(SSR) can be found in the text [Nilsson 1971]. Formal treatment can be found in a
paper [VanderBrug and Minker 1975] and in a dissertation [Stockman 1977]. Also
relevant is a paper by Hall [1973] showing the equivalence of a context free grammar
(CFG) to a finite AND/OR graph, and a paper by Chang and Slagle [1971] showing that
conversion can be made from PRR to SSR so that the A* algorithm can be used to produce
solutions of AND/OR graphs. The practical result of integrating this work is
as follows. Structural constraints on real-world objects can be modeled by a CFG or
its equivalent PRR. Recognition of the object then amounts to parsing data using
the PRR. Recognition results in a parse tree (CFG) or a solution tree (PRR) which
is a hierarchical breakdown of each object structure in terms of its components
recognized in the data.


In order to effect an efficient non-directional analysis, special embellishments
are appended to the usual PRR. First of all, AND successors are ordered
and are searched for sequentially and only after all previous successors of the
set are solved. For example, if problem A is solved by solving both problems B
and C, only one of the subproblems B or C will be posed at a time. There is no
sense wasting effort to solve B if C is unsolvable. This strategy was used in
a top-down parser by Chartres and Florentin [1968]. The first AND successor of
a set of subproblems of problem P is called a primary successor of problem P.
Every OR successor of problem P is called a primary successor. A primary descendant
of the root problem R is either a primary successor of R or the primary successor
of some primary descendant of R. In the linguistic pattern recognition context,
primary terminals are key primitives or prominent features which can be reliably
detected without syntactic constraint. Recognition of a primary problem would
then trigger the search for the solution to problems which have the solved problem
as primary successor. Search for this solution would typically involve a top-down
search for the solution of other non-primary successors. If the inverse of the
primary successor relation is available in the PRR, analysis can proceed recursively
in either bottom-up or top-down direction. CFGs (hence finite AND/OR graphs) are
easily inverted for bottom-up analysis. If A -> BC (problem A is solved by solving
both problems B and C) and D -> BFG, then goals A and D should be initiated if a solution
to structure B were at hand. Separate (parallel) model-directed searches would then
be done for the solution of successor C of A and the higher priority of successors F
and G of D.

In [Stockman 1977] a conversion is made from PRR to SSR which has the
following properties.

(1) PRR has a solution graph if and only if SSR has a solution path.

(2) All solutions to PRR can be found in a top-down mode by search
    of SSR with the initial state encoding the root problem of PRR.

(3) All solutions to PRR can be found in a bottom-up-top-down mode
    by search of SSR with a set of initial states, each one encoding
    some solved primary descendant of the root problem.

Any of the standard search algorithms of SSR [see Nilsson 1971] will do:
depth-first, breadth-first, or ordered search. In applications discussed below
a heuristic function evaluating the merit of partial solutions was used which
enabled A* search. Note that bottom-up initiation of search (point 3 above)
requires that PRR have a finite set of primary primitives, which is the case with
a CFG.
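
The primary successor relation and its inverse can be made concrete with a small
sketch. The grammar below is hypothetical and is not the rectangle grammar of the
report. Each nonterminal maps to a list of alternatives (OR successors), and each
alternative is an ordered list of symbols (AND successors) whose first member is
the primary successor. The function names are likewise assumptions.

```python
from collections import defaultdict

# Hypothetical grammar: R -> A | D,  A -> B C,  D -> B F G
GRAMMAR = {
    "R": [["A"], ["D"]],
    "A": [["B", "C"]],
    "D": [["B", "F", "G"]],
}

def invert_primary(grammar):
    """Map each symbol to the goals it can initiate as a primary successor.

    Each entry is (goal, remaining), where remaining lists the non-primary
    successors that must still be found top-down, in order.
    """
    inverse = defaultdict(list)
    for goal, alternatives in grammar.items():
        for rhs in alternatives:
            inverse[rhs[0]].append((goal, rhs[1:]))
    return dict(inverse)

def primary_descendants(root, grammar):
    """Collect every symbol reachable from root through primary successors."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for rhs in grammar.get(node, []):
            first = rhs[0]
            if first not in seen:
                seen.add(first)
                stack.append(first)
    return seen

inverse = invert_primary(GRAMMAR)
print(inverse["B"])                       # goals initiated once B is solved
print(primary_descendants("R", GRAMMAR))  # candidates for bottom-up initiation
```

With a solution to B at hand, the inverse relation proposes goals A and D and
lists the successors still to be searched for in each case. A finite grammar
always yields a finite set of primary descendants, which is what bottom-up
initiation requires.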

3.3  An  experiment  in  the  recognition  of  rectangles 

In this section a simple, but non-trivial, example is given of the non-directional
analysis algorithm outlined in Section 3.2. Actual computer runs on real
and simulated data have been made and have demonstrated the capabilities of the
analysis paradigm. The non-directional analysis algorithm was first implemented
as the structural component of a waveform parsing system [Stockman 1977] and was
rigorously studied in the recognition and measurement of pulse waves. The identical
structural component was then applied to the recognition of rectangular objects
in images as described below. The transition from 1-D to 2-D data was enabled
by the system's treatment of locational information as attributes of structures.
Problem specific procedural semantics were necessary to handle attribute manipulation
in each application and were coupled to the structural analysis in a uniform
way.

3.3.1  The  experimental  data 

Figure 19 shows the simulated experimental data. Input to the recognition
system is a set of undirected edge elements each specified by two points. The
data is rough because no complete contours exist and there are gaps and changes in
orientation along the sides. Complex corners could fool ordinary tracking algorithms.
This data, however, is probably better than can be expected from preprocessors in
many applications. Generally it should not be expected that edges sufficiently
characterizing object structure can be delivered by model-independent preprocessing.
Suppose, for instance, that edge element 9 was quite faint in the image. Globally
parameterized edge detectors would then not deliver that edge element. There is,
however, a solution to this problem in model-directed local edge detection. Suppose
that the sides DA, AB, and BC of rectangle ABCD were recognized at a certain point
in the analysis. At that point edge CD could be hypothesized and the image scanned
under lenient parameters. The data of Figure 19 should therefore be regarded as a
union of two sets of edges, those primary edge structures detected under stringent
global parameterization and those secondary edge structures detected locally under
lenient parameterization. In the actual computer runs edge elements #7 and #12 were
used as primary edge structures, but arbitrary choices could have been made for start

Figure 21. A Context Free Grammar (CFG) for rectangles
corresponding to PRR of Figure 20.

upon the quality of detection and upon the degree to which the detection satisfies
the structural hypothesis. The quality of a non-primitive structure is defined
as the minimum quality of its substructures. This definition might not create the
"best" recognition procedure but it does create an admissible search for the best
interpretation. The merit of a path in model space is defined as the minimum quality
of any structure recognized along that path. The ordered search for interpretations
will thus find the highest quality one first because it always extends the highest
merit path first.
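
The effect of the minimum-quality merit on an ordered search can be shown with a
toy example. The sketch below is not the report's state space search: the graph,
its quality values, and the function name are invented for illustration. Because
the merit of a path can never increase as the path is extended, the first goal
removed from the priority queue has the highest merit of any goal.

```python
import heapq

def ordered_search(start, expand, is_goal):
    """Best-first search that always extends the highest-merit path.

    expand(state) returns (next_state, quality) pairs with quality in [0, 1].
    The merit of a path is the minimum quality recognized along it.
    Returns (goal_state, merit), or (None, 0.0) if no goal is reachable.
    """
    counter = 0                              # tie-breaker: earlier states first
    frontier = [(-1.0, counter, start)]      # heapq is a min-heap, so negate merit
    while frontier:
        neg_merit, _, state = heapq.heappop(frontier)
        merit = -neg_merit
        if is_goal(state):
            return state, merit
        for nxt, quality in expand(state):
            counter += 1
            heapq.heappush(frontier, (-min(merit, quality), counter, nxt))
    return None, 0.0

# Hypothetical search graph: two routes to a goal with different weakest links.
GRAPH = {
    "S": [("P", 1.0), ("Q", 0.9)],
    "P": [("X", 0.5)],
    "X": [("G1", 1.0)],
    "Q": [("G2", 0.95)],
}
result = ordered_search("S", lambda s: GRAPH.get(s, []), lambda s: s.startswith("G"))
print(result)
```

The route through P starts with a higher rating but is abandoned once its merit
drops to 0.5, at which point the route through Q becomes the highest rated open
path.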

3.3.4  An  example  of  processing 

The non-directional algorithm was started on the data of Figure 19 with the
syntactic binding STRAIT = (13,22) - (8,18), that is, the primary terminal of the
grammar was identified to be the straight edge element directed from point (13,22)
to point (8,18). Significant states of the resulting state space search are described
below. Each state is a partial parse tree and has a merit computed from the
recognized terminal structures in it. By state #3 the <PRIM> structure is recognized
and the grammar immediately causes three states to be generated, one each to search
for <ALT2>, <ALT3>, and <LFRC> respectively. The <ALT2> alternative attempts to
extend the side backward while the <ALT3> alternative attempts a forward extension.
The <LFRC> alternative sets the goal of finding a second side at a 90° bearing from
the first. In states 4 to 10 the <LFRC> alternative is pursued but no such
perpendicular side exists and the search path deadens. <ALT3> does succeed in a forward
extension of <PRIM> to point (5,16). In so doing, the merit of states on this path
drops from 1.0 to 0.9. This enables <ALT2> to be pursued in states 13 and 14 which
produce no backward extension. The new <PRIM> structure recognized from point
(13,22) to (5,16) once again causes 3 alternative search goals to be set: forward,
backward, and perpendicular extension. States 19 to 48 pursue a perpendicular
extension from point (5,16) to (12,7), but point (12,7) is a dead end since no
further extension is possible. At state 53 another open path is picked up and a
perpendicular extension is driven from point (9,11) to point (17,17). Thus by
state 72 three sides of rectangle DABC are recognized. By state 97 a path is
driven perpendicular to side BC to point (10,28), thus completing the recognition
of 4 perpendicular sides. However, the path overshoots the correct beginning
point of the rectangle and becomes dead due to a semantic check on the sizes of
sides 2 and 4. An alternate open path is pursued to final state 106 causing
recognition of rectangle DABC. Two open paths remain by state 106 representing
paths D to (3,14) and D to (5,16) to (9,11) to (15,15) respectively but cannot
develop into recognition of other rectangles.

A more detailed presentation of the search is now given. At any state of
the search there may be one or more partial matches of the model (Figure 20) with
the data (Figure 19). Each partial match is rated for its quality and this rating
is used to determine which analysis is extended in the next search state. The example
search was started with state #1 rated as 1.0 and encoded as follows.

         5            5
    #  . [ 13,22,8,18 ]
         1            1

The meaning of this encoding is that structure 5, i.e. the STRAIT structure in the
rectangle model of Figure 20, has been recognized spanning points (13,22) and (8,18)
of the image. Structure 5 is substructure 1 of its parent structure. The dot
to the left of the bracket "5" indicates that processing is focused on structure 5.


Using the model, the search algorithm recognizes that structure 2 exists and
generates state #2 encoded as

         2            5            5 2
    #  . [ 13,22,8,18 [ 13,22,8,18 ] ]
         1            1            1 1

and also rated at 1.0.

Recognition of structure 2 implies recognition of structure 1 at state #3 of the
analysis. Special processing indicated in the model (but not indicated in Figure 20)
causes the state encoding to be collapsed into

         1            1
    #  . [ 13,22,8,18 ]
         1            1


The model indicates that the <PRIM> structure 1 can be part of 3 different
superstructures. Thus at state #4 of the analysis there are three competing parallel
interpretations encoded as follows.

       8   1            1 8
    #  ( . [ 13,22,8,18 ] )
       1   1            1 1

       4   1            1 4
    #  ( . [ 13,22,8,18 ] )
       1   1            1 1

       3   1            1 3
    #  ( . [ 13,22,8,18 ] )
       1   1            1 1

The first interpretation has structure 1 as the complete first side of the rectangle.
The second and third alternatives see structure 1 as an incomplete side that must be
extended in the forward or backward direction. All three alternatives are rated 1.0;
the top one is taken for expansion next in the search. Using the model, the search
generates the following encodings in a top-down manner. Note the 90° direction change
as specified in the model for searching for side 2 with respect to side 1.


       8   1            1 8
    #  ( . [ 13,22,8,18 ] )
       1   1            1 1

       8   9            9 1            1 8
    #  ( . ( 8,18,11,12 ) [ 13,22,8,18 ] )
       1   2            2 1            1 1

       8 9              13           13 9 1            1 8
    #  ( ( 8,18,11,12 . ( 8,18,11,12 )  ) [ 13,22,8,18 ] )
       1 2              1            1  2 1            1 1

       8 9            13             10           10 13 9 1            1 8
    #  ( ( 8,18,11,12 ( 8,18,11,12 . ( 8,18,11,12 )  )  ) [ 13,22,8,18 ] )
       1 2            1              1            1  1  2 1            1 1

    DEAD

This line of analysis deadens because structure 10 is a primitive straight line
structure for which there is no above threshold evidence in the data. An alternate
course of analysis is thus pursued as follows.


       4   1            1 4
    #  ( . [ 13,22,8,18 ] )   rated 1.0
       1   1            1 1

       4              7           7 1            1 4
    #  ( 13,22,8,18 . ( 8,18,3,14 ) [ 13,22,8,18 ] )   rated 1.0
       1                            1            1 1

       4              7           7 1            1 4
    #  ( 13,22,8,18 . [ 7,17,5,16 ] [ 13,22,8,18 ] )   rated 0.9

The <PRIM> structure has been extended forward to point (5,16) but at the
expense of shooting a gap: hence the rating is reduced in proportion to the gap
size to 0.9. Structural alternative <ALT2> is pursued temporarily because of its
higher rating of 1.0, but after failure the line of analysis just above is again taken
up as the highest rated alternative. A few of the encodings along the path to a
correct recognition are as follows.


AFTER RECOGNITION OF TWO SIDES DA AND AB

       8             9           9 1            1 8
    #  ( 5,16,9,11 . [ 5,16,9,11 ] [ 13,22,5,16 ] )   rated 0.9
       1             2           2 1            1 1

AFTER RECOGNITION OF THREE SIDES DA, AB, AND BC

       8              9            9 9           9 1            1 8
    #  ( 9,11,17,17 . [ 9,11,17,17 ] [ 5,16,9,11 ] [ 13,22,5,16 ] )   rated 0.9
       1              3            3 2           2 1            1 1

AFTER PROPOSING SEARCH FOR FOURTH SIDE CD

       8              9             9 9            9 9           9 1            1 8
    #  ( 9,11,17,17 . ( 17,17,11,25 ) [ 9,11,17,17 ] [ 5,16,9,11 ] [ 13,22,5,16 ] )   rated 0.9
       1              4             4 3            3 2           2 1            1 1

AFTER RECOGNIZING ENTIRE RECTANGLE

         8             9             9 9            9 9           9 1            1 8
    #  . [ 17,18,14,22 [ 17,18,14,22 ] [ 9,11,17,17 ] [ 5,16,9,11 ] [ 13,22,5,16 ] ]   rated 0.9
         1             4             4 3            3 2           2 1            1 1

The automaton that manipulates such encodings to perform the analysis is detailed
in [Stockman 1977]. The manipulation of the search areas, or intervals, was done in
associated "semantic" routines. This was necessary so that the overall problem solving
mechanism would work uniformly on waveform and image data.

3.3.5 Discussion of rectangle recognition experiments


The <RECT> goal in the rectangle PRR was not actually used as the root problem
in the experiments. <LFRC> was used instead to control only a counterclockwise
search. <RCRO controls clockwise search and, to save time, was not used. It was
the intention that in future work the confidence measure would be allowed to increase,
and thus, while a path may block in one direction due to noise or distortion, it may
be found in the reverse direction after enough confidence has been built to overcome
the noise.

A simple primitive detection module was programmed so that edge elements as
pictured in Figure 11 could be extracted from grey scale images. The detector
was used to verify the existence of an edge element as predicted by the grammar.
The primary edge element STRAIT had to be recognized by other means. The detector
scanned across a hypothetical edge and recorded points of maximum gradient magnitude.
These points were then fit with a straight line to assign a confidence value
to the hypothesis. The system was then tried on two of the rectangular buildings
in the GAFB image. The searches were successful and were less bushy, i.e. more
efficient, than those on the constructed example of Figure 19. However, too much
adjustment was necessary to make the process work and little generality can be


3.4 More on the detection of shape features

Section 2 of this report discussed the detection of primitive features in
imagery without use of a priori model information. Straight edge elements, smooth
curve segments, and points of high curvature on them were discussed as useful
features. It was argued that many image features could be detected by cheap
implementations although a fair portion of existing features might be missed. Once
enough image features are assembled, hypotheses about the remaining image content can
be raised and tested by using models such as PRR. Verification of a hypothesis is much
more efficient than searching for a primary primitive without model guidance and thus
more refined shape features can be afforded.


Verification of the presence of a boundary segment of a particular shape can
be done using curve fitting, Hough detection, or searching for points of a prototype
set of points. These techniques are discussed in some detail in Section 5. An
experiment in the recognition of the curved tip of an airplane wing is described briefly
here. It was easy to add a parabolic curve-fitter to the existing straight line fitter
used in the rectangle experiment. As before, a specially shaped boundary curve was
hypothesized in a given region of the image with a given orientation. Profiles aligned
with the hypothesized coordinate system were searched and the high gradient points were
fit with a parabolic curve. Goodness of fit determined the confidence in detection.
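
A 1-D parabolic fit of the kind described can be sketched as an ordinary least
squares fit of y = a*x**2 + b*x + c to the high gradient points, expressed in the
hypothesized coordinate system. The function name and the use of the RMS residual
as the goodness-of-fit measure are assumptions; the report does not state how
goodness of fit was converted into a confidence value.

```python
import math

def fit_parabola(points):
    """Least-squares fit of y = a*x**2 + b*x + c.

    Returns (a, b, c, rms), or None if the fit is degenerate.
    """
    if len(points) < 3:
        return None
    sx = [sum(x ** k for x, _ in points) for k in range(5)]
    sy = [sum(y * x ** k for x, y in points) for k in range(3)]
    # Augmented normal equations for the unknowns (c, b, a).
    m = [[sx[0], sx[1], sx[2], sy[0]],
         [sx[1], sx[2], sx[3], sy[1]],
         [sx[2], sx[3], sx[4], sy[2]]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-9:
            return None  # e.g. fewer than three distinct x values
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    rms = math.sqrt(sum((a * x * x + b * x + c - y) ** 2 for x, y in points)
                    / len(points))
    return a, b, c, rms

# Synthetic profile points lying exactly on y = 2x^2 - 3x + 1.
tip_points = [(x, 2 * x * x - 3 * x + 1) for x in range(-3, 4)]
print(fit_parabola(tip_points))
```

The sign of the fitted coefficient a distinguishes the two senses of curvature.
Note that this formulation treats y as a function of x, so it can use only one y
value for each x value.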

Figure 22 gives a PRR model for the wing of an airplane of the type found in
the AFB1 image of Figure 3. Figure 23 shows the geometric structure modeled in
the PRR. (Curvature constraints for the CAP and CUP and length constraints for the
sides are omitted from Figure 22.) The curvature parameter of the parabolic fit
will be positive or negative as the wing boundary of Figure 23 is traversed clockwise or
counterclockwise respectively.


Search using the airplane wing PRR was never correctly carried out. The major
problem was that 1-D fitting techniques were used. The fitter was confused by the
many points existing at the juncture of the straight edge and the parabolic tip.
These points had the same x value but different y values: the profile scanning could
only select one point for each x value. 2-D curve fitting was clearly indicated but
the experimentation was halted in order to pursue more promising avenues as discussed
in Section 4.


3.5  Discussion  of  the  use  of  grammar  models 


Shaw [1970] realized the value of using grammar models for 2-D analysis. He
was also aware of the difficulty in doing segmentation as preprocessing. The work
reported here attempted to extend Shaw's work in two directions. First, bottom-up
operators were employed and second, multiple competing parses were developed.


Martelli [1976] exploited the search concept. His approach is to dynamically
search image data for continuing edge elements. A priori shape information was encoded
in heuristic functions. Martelli has even experimented on the same rectangle
detection problem discussed in this paper. No structural hierarchy was used and all
operations of the search were in terms of single pixel extensions of the path. This
study generalized the work of Martelli by using larger-than-pixel primitives and by
using paths of a grammar to remember and control the shape of search paths in the
data.

Figure 23. Geometric structure of an airplane wing modeled
by PRR of Figure 22. (Arrows indicate order of
edge traversal and not light/dark relation.)

Miller [1973] used both bottom-up and top-down operations in a small speech
recognition system, coining the term "island of reliability" for what are called
"primary primitives" in this report. Griffith [1973] also did bottom-up followed
by top-down processing. Neither Miller nor Griffith, however, presented a general
algorithm for carrying out such processing. This was the essential contribution
of [Stockman 1977] and the topic of this section.

Few would argue that vast amounts of a priori knowledge are not necessary for
real world scene analysis. By constructing a PRR certain a priori knowledge is
available for a uniform recognition process. Semantics coupled to the nodes of the
PRR can help. But a finite PRR is equivalent to a CFG and so it is well known that
we have a tractable but limited tool. Further work is necessary to test the viability
of this non-directional analysis technique on complex recognition problems. Items
learned from the current research follow.

The 1-D curve fitting technique used in this research and to be detailed in
Section 5 should be replaced with a 2-D technique. If 2-D curve fitting is too expensive
or unreliable then Hough curve detection should be tried. In general, quicker
techniques with fewer parameters should be used in verifying hypotheses: most of the
recognition information is contained in the structural constraints. Constraints
other than those on shape should be incorporated into the PRR; for instance, a
given region could be tested for a hot spot assuming that an IR image is also
available, or a given region could be tested for color if color imagery were being
used. PRR's that model loosely related composites should be tried. For instance
we might have <AIRFIELD> --> <BLDG GROUP> <PLANE GROUP> <RUNWAYS> where an airfield
is recognized as a composite of planes, buildings, and runways. More thought has to
be spent in assigning the confidence value to <AIRFIELD> given the confidence values
assigned to the components.


4.  Iconic shape models


This section examines the matching of arbitrary shapes extracted from an
image with shapes stored in a reference data base (RDB). There was considerable
discussion of geometric shape features in Section 2. The term "iconic" is used
to indicate that the image structure "looks like" the structure stored in the
RDB. Mathematically we might define that structure A looks like structure B if
there exists an RS&T transformation* mapping points of A onto points of B. That
is in fact the formalism that is used in this section. Some lenience must be
allowed in the paradigm so that (1) some rubber sheet distortion is permitted and
(2) missing or additional parts are permitted in the structured image points. The
first facility can be provided by use of approximate RS&T mappings which tolerate
some distortion and the second facility can be provided by use of a tolerant partial
matching procedure.

There are two specific applications to which this discussion is oriented. First
of all we want to be able to register aerial imagery with an existing geographic
(cartographic) data base (GDB). To do so we need to match image features with their
icons stored in the GDB. The result of a correct global matching of each image feature
to a corresponding GDB feature is an RS&T transform which registers all points
of the image to the GDB coordinate system. As a result of the registration, the
complete image can be examined for content in comparison to the iconic feature content
stored in the GDB. Updates to the GDB can be made. For instance, rivers can be
searched in the image for the appearance of new bridges or airfields could be
searched for the disappearance of planes. In a second case we want to be able
to recognize generic moveable objects by matching image edge structure to iconic
edge structure stored in a model (IEM for iconic edge model). For example, the
edge structure of a B-52 airplane could easily be stored in the same manner that
a road network is stored in a GDB. The important distinctions are that (1) the
object is not unique but may exist in any number of copies having identical geometry,
(2) the object can appear and disappear at many earth locations during
certain time lapses, and (3) the location and orientation of the object is not
known a priori even after imagery is registered with a GDB.

4.1  Registration  of  image  structures  to  models 

Image registration and object detection are treated below as two sides of the
same coin. A technique is developed for matching image structures with iconic
structures in a stored map (GDB) or model (IEM). Abstractly there is no difference
between a map and a model and the algorithms discussed in 4.2 make no distinction
between the two. Since registration and object detection have traditionally been
distinct concepts, background work is discussed under separate categories.

4.1.1  Image  registration 

The problem of registering information from one image to that of another image
or map, possibly made at a different time and from a different perspective, is one
of the major problems in image processing. There are many applications. In
medicine there is the problem of comparing two x-rays made a year apart. In
industrial engineering there is the problem of inspecting an assembly of parts
to check conformity with a blueprint. In photo interpretation there are the
problems of collating information about a single point from a set of images from
different sensors and of analyzing a given area for changes over time.


Mathematically formulated, the problem of registering two images is the problem
of determining a transformation T that maps an arbitrary point P_1 in the first
image coordinate space to the corresponding point P_2 in the second image coordinate
space. The work discussed in this report assumes flat imagery and linear transformations;
that is, points have 2 coordinates (x,y) and T is specified by a rotation
θ and translation (xs,ys). In general points may be specified by more than 2
coordinates and transformation T could have many parameters if non-linear warping
is required.


A straightforward registration technique is to use human selection of corresponding
"control" points in the two images. These points usually represent salient
features such as the corners of buildings or the intersection of roads or streams.
Let transformation T be defined by a parameter vector α and let the set of corresponding
points be {(P_11,P_21), (P_12,P_22), ..., (P_1k,P_2k)}. The best transformation
T_α mapping the first image into the second can be defined as that transformation
such that

$$ e = \sum_{i=1}^{k} d^2\left(T_\alpha(P_{1i}),\, P_{2i}\right) $$

is minimized, where d^2 is the squared distance between the control point P_2i in the
second image and the corresponding control point P_1i transformed into the second image
coordinate space. Thus, once corresponding control points are chosen classical least
squares fitting can produce the "best" transformation to be used to register all
points.
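
For the rotation-plus-translation case considered in this report, the least squares problem has a closed-form solution. The sketch below is one standard way to compute it, offered as an illustration rather than as the fitting routine actually used; the function names are hypothetical and the convention assumed is that a point is first rotated by θ about the origin and then translated by (xs,ys).

```python
import math

def fit_rotation_translation(pairs):
    """Least-squares (theta, xs, ys) from control point pairs ((x1, y1), (x2, y2)).

    Assumed convention: P2 = R(theta) * P1 + (xs, ys).
    """
    k = len(pairs)
    cx1 = sum(p1[0] for p1, _ in pairs) / k
    cy1 = sum(p1[1] for p1, _ in pairs) / k
    cx2 = sum(p2[0] for _, p2 in pairs) / k
    cy2 = sum(p2[1] for _, p2 in pairs) / k
    s_cos = s_sin = 0.0
    for (x1, y1), (x2, y2) in pairs:
        ax, ay = x1 - cx1, y1 - cy1   # first-image point, centered
        bx, by = x2 - cx2, y2 - cy2   # second-image point, centered
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    theta = math.atan2(s_sin, s_cos)  # rotation minimizing the squared error
    c, s = math.cos(theta), math.sin(theta)
    # The translation carries the rotated first centroid onto the second centroid.
    return theta, cx2 - (c * cx1 - s * cy1), cy2 - (s * cx1 + c * cy1)

def total_squared_error(alpha, pairs):
    """The criterion e: sum of squared distances after applying T_alpha."""
    theta, xs, ys = alpha
    c, s = math.cos(theta), math.sin(theta)
    return sum((c * x1 - s * y1 + xs - x2) ** 2 + (s * x1 + c * y1 + ys - y2) ** 2
               for (x1, y1), (x2, y2) in pairs)
```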

Automation of control point selection automates the entire registration procedure.
Van Wie and Stein [1977] have reported on a system which brings ERTS-LANDSAT
imagery into registration with UTM* maps. Map control points are human selected and
so are the control points of the first imagery processed. Binary gradient masks
in the neighborhood of each image point P_1i are computed and stored with the map
control points P_2i. When subsequent imagery of the area is obtained via the satellite
the stored gradient masks are correlated with the gradient image in order to
locate the points P_1i. This technique depends on the justifiable assumption that
the registration transformation for this problem is approximately known so that the
correlation can be done in a focused manner, i.e. the orientation and neighborhood
for correlating the masks is approximately known. According to Van Wie and Stein
this automatic control point selection goes well about 80% of the time and requires
human intervention about 20% of the time because of weak correlation.

The work of Horn and Bachman [1977] deals with the registration of an aerial
image with a synthetic image constructed from a model of the light source (sun) and
ground elevation. The best transformation T_α is gotten by hill-climbing (optimization)
in the α-parameter space to maximize a criterion function which considers all
points corresponding in the two images and not a selected set of control points.
Such a procedure depends on having a good approximation of the optimum α to begin
the optimization and requires a large computational effort at each of its steps
in order to determine fitting quality of T_α. Because it is a global procedure
and requires no specially selected features it should be robust in performance.

Some middle ground has been explored at SRI by [Barrow et al 1977]. An image
is characterized by a sparse set of feature points which can be automatically extracted.
Points along lineal features of good contrast are recommended to represent
the image. Similarly, points of lineal features (e.g. a coastline) are represented
in a map. Hill-climbing optimization from an assumed approximate α is used to determine
the transformation T_α which is optimal for matching points in the image with
points of the map. Determining the quality of fit for a given T_α is aided by a technique
known as "chamfer matching". Computationally this method is far more palatable
than that of Horn, but it does rely on feature detection and does not accurately evaluate
the fit of T_α due to the chamfering trick. Faster but perhaps less reliable results
should be expected.

Section 4.2 of this report presents a novel approach to the registration of
images using straight edge content. Clusters are formed in an α-parameter space
by distributing points gathered from local evidence. The parameter space is the
space to be "searched" for the optimal registration transformation T_α. An item of
local evidence is a pair of edge elements, one from the first image and one from
the second image, which could be interpreted as being the same feature. The correct
transformation parameter set α is that set that integrates the most local support.

4.1.2  Object detection


Object detection has been handled in various ways in past work. The most
obvious technique is template matching which can be done in the Fourier domain
to achieve rotation, translation, and scale invariance. Duda and Hart [1973]
and Rosenfeld and Kak [1976] describe 2-D template matching and Zahn and Roskies
[1972] describe matching of invariant features gotten from the 1-D Fourier expansion
of the boundary of the object. Structural pattern recognition attempts to
recognize objects as a synthesis of parts by either ad hoc or formal techniques.
The work of Guzman [1971] and Shaw [1970] are examples in this category.

Template matching has as its strength the ability to integrate over many
pieces of evidence, usually points, to reach a match and is quite tolerant of imperfect
or noisy input. While conceptually simple, template matching can be expensive
to implement digitally. Synthetic techniques make a low level interpretation
of the data in preprocessing and then usually sequentially interpret the low level
primitives to assemble objects. While relatively efficient in digital implementations,
sequential synthetic techniques are difficult to control in situations where
data can be imperfect or noisy.

A third object detection technique is conceptually described in [Duda and Hart
1973, Sec 12.3] and practically implemented by Perkins [1977]. This technique involves
determining a transformation that would map the image, or part of it, to a
model of the object and is a 3-step process.

(G1)  Obtain corresponding structures in the image and model. Structures
      correspond when they have the same shape, size, etc. Image structures
      are to be obtained automatically while model structures may be interactively
      compiled.

(G2)  Determine transformation parameters α=(α_1,α_2,...,α_n) such that T_α maps
      at least some corresponding image structures (points, lines, arcs, etc.)
      onto model structures.

(G3)  Determine the degree of match between the entire set of transformed image
      structures and model structures.


4.2  A new  procedure  for  registration  and  object  detection 


This report describes a new technique for performing the 3-step process described
above. The technique is a hybrid of template-matching and structural analysis and
combines the advantages of those two procedures. The specific interpretations of the
general steps above are as follows.

(S1)  Assume all structures of the same type correspond. For example, assume
      each straight line segment in the image can correspond to each straight
      line segment of the model, each convex curve in the image can correspond
      to each convex curve of the model, etc. For each pair of structures
      (s_i,s_m), where s_i and s_m are structures from the image and map respectively,
      compute transformation parameters α and place a unit of measure in
      α-parameter space.

(S2)  Possible transformations T_α between image and model are detected as clusters
      in the α-parameter space formed in step S1 because heavy measure in α-space means
      that many correspondences are explained by T_α.

(S3)  Evaluation of the match strength of each T_α from step S2 is gotten
      by either computing an average distance between all corresponding
      structures or by counting the number of image structures explained
      by the model structures under T_α.

It must be noted that the procedure just outlined is essentially high-level
Hough detection: distribution of mass to a parameter space to detect evidence of
global structure.
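
Steps S1 and S2 can be sketched for directed straight edge elements as follows. This is a minimal illustration under stated assumptions, not the L.N.K. implementation: edge endpoints are taken to be accurate, every image edge is paired with every map edge, clusters are found by simple fixed-size binning, and the names `pair_parameters`, `strongest_cluster`, and `bin_sizes` are hypothetical. The vote count returned for the winning cluster stands in for the "counting" form of step S3.

```python
import math
from collections import defaultdict

def pair_parameters(img_edge, map_edge):
    """(theta, xs, ys) that carries a directed image edge onto a directed map edge.

    Each edge is ((x_start, y_start), (x_end, y_end)); theta is in radians.
    """
    (ax, ay), (bx, by) = img_edge
    (cx, cy), (dx, dy) = map_edge
    theta = (math.atan2(dy - cy, dx - cx)
             - math.atan2(by - ay, bx - ax)) % (2 * math.pi)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that moves the rotated image start point onto the map start point.
    return theta, cx - (c * ax - s * ay), cy - (s * ax + c * ay)

def strongest_cluster(img_edges, map_edges, bin_sizes=(0.1, 0.5, 0.5)):
    """S1: one vote per edge pairing; S2: return the mean of the heaviest bin."""
    bins = defaultdict(list)
    for e_i in img_edges:
        for e_m in map_edges:
            alpha = pair_parameters(e_i, e_m)
            key = tuple(round(v / b) for v, b in zip(alpha, bin_sizes))
            bins[key].append(alpha)
    votes = max(bins.values(), key=len)
    n = len(votes)
    center = tuple(sum(v[k] for v in votes) / n for k in range(3))
    return center, n
```

Fixed binning can split a cluster that straddles a bin boundary, which is one motivation for smoothing the accumulator before looking for heavy regions.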

4.2.1  An  illustrative  example 

A simple example of this process is illustrated in Figure 24. Assume that
the image can be represented by the 4 directed edge elements shown in (a) while
the map contains the edge elements in (b). It is assumed that the length of the
edge elements is accurately known. There are 16 possible ways that an edge element
from (a) can be paired with an edge element in (b). Each pairing yields unique
transformation parameters (θ,xs,ys) as shown in (c). Four of the 16 possible pairings
yield a consistent interpretation: rotate by θ=0.79 radians and translate by
(4.3,2.0). The parameters from the 4 correct pairings form a cluster in α=(θ,xs,ys)-space
while the parameters from incorrect pairings are sparsely distributed in the
space. In practical cases there will be many more than 4 primitive structures and
not all pairings will be possible (e.g. due to size or shape differences) so the
presence of a cluster in the parameter space should be even more obvious.

4.2.2  Clustering techniques

Two different clustering techniques were used for L.N.K. registration and
object detection experiments. They shall be called the hierarchical technique
and the variable resolution technique. In the hierarchical technique clustering
was first done on θ alone and then in (xs,ys)-space given a fixed θ hypothesis.
This technique was useful for development and human interaction because θ-space
could be viewed as a histogram and (xs,ys)-space as a scatter plot. Points of
high density in the histogram or scatter plots were found either automatically
or by human selection.

In general, true endpoints of edge elements cannot be gotten reliably as was
assumed in the example of Figure 24. However, good corner points can be gotten
if correct pairs of edge elements are extended to an intersection. These corner
points can then be used to accurately define the registration transformation.
Details are as follows. For each pair (s_i,s_m) of image edge elements and map edge
elements, the rotation θ_im necessary to rotate s_i into s_m is recorded in a histogram.
(A clustering space of 360 one-degree bins.) The top 3 peaks of the histogram,
after smoothing, are examined further. For a given θ, points in (xs,ys)-space
are gotten as follows. Let map edge elements C_mD_m and C_nD_n specify a ground control
point (GCP) in the map. For each image edge element pair A_iB_i and A_jB_j rotating into
C_mD_m and C_nD_n by rotation θ, a translational component (xs,ys) is readily computed
and a unit of measure placed in (xs,ys)-space. When all GCP's are treated in this
manner, clustering in (xs,ys)-space is easily done by examining a scatter plot and
a full set of transformation parameters α=(θ,xs,ys) is available. The goodness
of T_α can then be evaluated as in step S3 above.
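
The translational component for one candidate pairing can be sketched as below: each pair of edge elements is extended to its corner point, the image corner is rotated by the hypothesized θ, and the difference to the map GCP is the (xs,ys) vote. The function names are hypothetical, and the check that the image edges actually rotate into the directions of the map edges is assumed to have been done by the caller.

```python
import math

def line_intersection(e1, e2):
    """Intersection of the infinite lines through two edge elements, or None if parallel."""
    (x1, y1), (x2, y2) = e1
    (x3, y3), (x4, y4) = e2
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(d) < 1e-12:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def translation_vote(theta, img_pair, map_pair):
    """(xs, ys) implied by matching the corner of img_pair to the GCP of map_pair.

    theta is the rotation hypothesis in radians; each pair is two edge elements.
    """
    corner = line_intersection(*img_pair)
    gcp = line_intersection(*map_pair)
    if corner is None or gcp is None:
        return None
    c, s = math.cos(theta), math.sin(theta)
    return (gcp[0] - (c * corner[0] - s * corner[1]),
            gcp[1] - (s * corner[0] + c * corner[1]))
```

Because the corner is computed from the extended lines, the vote does not depend on where the extracted edge elements happen to start or end along those lines.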

The variable resolution clustering technique uses a fixed 10x10x10 partition
(binning) of the 3-D parameter space. After smoothing, heavy regions of 2x2x2 bins
are selected for clustering at the next level. Points in the 2x2x2 bin region are
redistributed to the 10x10x10 bins by scaling and cluster detection reapplied. This
refinement process is continued until there is no more cluster evident or until the
bin size reaches the limit of resolution inherent in the problem domain. Since the
size of a bin is reduced to 1/5th size with each level of clustering, 3 levels is
typical yielding 3° x 2 pixel x 2 pixel bin size for a (θ,xs,ys) originally constrained
to a 360° x 500 pixel x 500 pixel space. Note that only rotation and translation
parameters are used: transformations using scale changes as well would involve 4
parameters instead of 3 and were saved for future experiments.
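
A simplified sketch of this coarse-to-fine refinement follows. It is an illustration under several assumptions: the smoothing step is omitted, a fixed number of levels replaces the "no more cluster evident" stopping test, only the single heaviest 2x2x2 region is followed, and wraparound of the rotation axis is ignored. The name `refine_cluster` and its parameters are hypothetical.

```python
from itertools import product

def refine_cluster(points, lo, hi, levels=3, n_bins=10):
    """Repeatedly bin 3-D points into n_bins^3 cells and zoom into the heaviest 2x2x2 block.

    points: iterable of (theta, xs, ys); lo, hi: per-axis bounds of the search space.
    Returns the final (lo, hi) bounds and the number of points inside them.
    """
    lo, hi = list(lo), list(hi)
    points = [p for p in points if all(l <= v <= h for v, l, h in zip(p, lo, hi))]
    for _ in range(levels):
        size = [(h - l) / n_bins for l, h in zip(lo, hi)]
        counts = {}
        for p in points:
            idx = tuple(min(int((v - l) / s), n_bins - 1)
                        for v, l, s in zip(p, lo, size))
            counts[idx] = counts.get(idx, 0) + 1
        best, best_mass = None, -1
        for corner in product(range(n_bins - 1), repeat=3):
            mass = sum(counts.get(tuple(c + d for c, d in zip(corner, off)), 0)
                       for off in product((0, 1), repeat=3))
            if mass > best_mass:
                best, best_mass = corner, mass
        if best_mass <= 0:
            break  # nothing left to refine
        lo = [l + c * s for l, c, s in zip(lo, best, size)]
        hi = [l + 2 * s for l, s in zip(lo, size)]
        points = [p for p in points if all(l <= v < h for v, l, h in zip(p, lo, hi))]
    return lo, hi, len(points)
```

Each level keeps 2 of 10 bins per axis, so the extent of the search region along every axis shrinks by a factor of 5 per level.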

4.3  Image  registration  experiments 

Several image registration experiments were run in order to test the viability
of the new concept. The hierarchical clustering technique described earlier was
used and there was considerable human interaction in the early stages of testing.
Input to the procedure were two sets of directed straight edge elements; one called
the image and the other called the map. The image edge set was extracted automatically
by the Hough detector described in Section 2. The map edge set was compiled
by humans making measurements on print plots of the digital imagery. The resulting
maps tended to be very crude because 1) the amount of human effort in making measurements
was great and 2) a procedure was desired that would perform well on crude
input which would be expected from automatic processing. Several edge pairs whose
intersections made good ground control points (GCP's) were identified in each map.
Typically GCP's were chosen as nearly 90° intersections of large edges. GCP's were
used in the second stage of hierarchical clustering to determine the translation
(xs,ys). Output from the registration procedure was a set of possible transformations
(θ,xs,ys) mapping the image edge set onto the map edge set and a strength
value for each. The strength of a registration transformation T_α = T_(θ,xs,ys) was
determined heuristically and left for human interpretation. Three numbers were
actually output. For each pair of edge elements (s_i,s_m), s_i from the image and s_m
from the map, a value in the range [0,1] could be assigned according to how close
T_α(s_i) was to s_m in the map coordinate space. The procedure reported a count of
all image edges s_i and a count of all map edges s_m which had no corresponding edge
element with above 0 value. The sum of all match values was also reported.
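
The three reported numbers can be illustrated with the sketch below. The report does not give the closeness function, so everything specific here is an assumption: closeness is taken as a linear falloff in mean endpoint distance with a hypothetical tolerance `tol`, each image edge contributes its best match value to the total, and the names `transform_edge`, `match_value`, and `registration_report` are invented for the example.

```python
import math

def transform_edge(alpha, edge):
    """Apply T_alpha (rotate by theta, then translate by (xs, ys)) to both endpoints."""
    theta, xs, ys = alpha
    c, s = math.cos(theta), math.sin(theta)
    return tuple((c * x - s * y + xs, s * x + c * y + ys) for x, y in edge)

def match_value(e1, e2, tol=5.0):
    """Assumed closeness in [0, 1]: 1 when endpoints coincide, 0 beyond tol pixels."""
    d = sum(math.dist(p, q) for p, q in zip(e1, e2)) / 2
    return max(0.0, 1.0 - d / tol)

def registration_report(alpha, img_edges, map_edges, tol=5.0):
    """Return (unmatched image edges, unmatched map edges, total strength)."""
    moved = [transform_edge(alpha, e) for e in img_edges]
    values = [[match_value(e, m, tol) for m in map_edges] for e in moved]
    best_per_image = [max(row) for row in values]
    unmatched_image = sum(1 for v in best_per_image if v == 0.0)
    unmatched_map = sum(1 for j in range(len(map_edges))
                        if all(row[j] == 0.0 for row in values))
    return unmatched_image, unmatched_map, sum(best_per_image)
```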

4.3.1  Image  registration  data  sets 

Three major test sets were used to test the registration procedure. The images
used were AFB2, GAFB, and UNB (shown in Figures 11-13). Edge element sets
automatically extracted from these images are shown in Figures 11, 14, and 16.
The maps for AFB2 and GAFB were made from print plots of the digital imagery, making
it known a priori that the correct registration transformation was (θ=90°,xs=0,ys=0).
Since repeat coverage was available for UNB, image edge elements were extracted from
UNB1 (Figure 13) and a map was constructed from print plots of UNB2 (Figure 7).
Three ground control points were used in each UNB image to establish what a good
approximate registration transformation should be. Plots of the edge elements used
for AFB2, GAFB and UNB are shown in Figures 25, 26 and 27 respectively. Details
of the 3 test sets are summarized in Table 1.

4.3.2  Results of the image registration experiments


Table 2 summarizes the results of the experiments with the test images
described in Table 1. The results confirm the usefulness of the new registration
concept and there should be additional comfort in the fact that very crude maps
were used and other useful registration heuristics were ignored. It is important
to point out that clustering in θ-space was done automatically while clustering in
(xs,ys) was done interactively. The histogram of θ rotations formed by rotating
each image edge into each map edge was first smoothed by a 3° averaging window and
then subjected to peak detection. Up to 3 of the best peaks were passed on to the
next stage of clustering. The threshold used was 50% of the number of edge elements
from the image. Clustering in (xs,ys) space was done interactively by human examination
of scatter plots presented by computer. Evaluation of the strength of selected
transformations (θ,xs,ys) was computed automatically.
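
The automatic first stage can be sketched as follows. Two choices here are assumptions rather than details from the report: the 3° window is applied as a windowed sum, so that the smoothed value can be compared directly with an edge count, and peaks are selected greedily with neighbors inside the window suppressed. The function names are hypothetical.

```python
import math

def direction_deg(edge):
    """Direction of a directed edge ((x1, y1), (x2, y2)) in degrees."""
    (x1, y1), (x2, y2) = edge
    return math.degrees(math.atan2(y2 - y1, x2 - x1))

def rotation_peaks(img_edges, map_edges, max_peaks=3, window=3):
    """Histogram the rotation of every image edge into every map edge; return peak bins."""
    hist = [0] * 360  # one-degree bins
    for e_i in img_edges:
        for e_m in map_edges:
            theta = (direction_deg(e_m) - direction_deg(e_i)) % 360.0
            hist[int(round(theta)) % 360] += 1
    half = window // 2
    # Circular windowed sum (assumed form of the 3-degree smoothing).
    smooth = [sum(hist[(b + k) % 360] for k in range(-half, half + 1))
              for b in range(360)]
    threshold = 0.5 * len(img_edges)
    order = sorted(range(360), key=lambda b: (smooth[b], hist[b]), reverse=True)
    peaks = []
    for b in order:
        if smooth[b] < threshold or len(peaks) == max_peaks:
            break
        # Skip bins within the window of a peak that was already accepted.
        if all(min((b - p) % 360, (p - b) % 360) > window for p in peaks):
            peaks.append(b)
    return peaks
```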

4.3.2.1  AFB2 data set

As Table 2 shows, there were two strong responses in the interpretation space
(θ,xs,ys) for AFB2 registration. Under the "correct" interpretation (θ=90°,xs=2,
ys=2) 35 of 48 image edge elements aligned with 13 of 16 map edge elements for a
total strength of 25.67. Due to the symmetry in the building patterns of AFB2 an
incorrect interpretation (θ=270°,xs=-46,ys=-18) aligned 15 of 48 image edge elements
with 4 of the 16 from the map for a total strength of 12.71. This type of ambiguity
was observed to some extent in all of our imagery because man-made structures exhibit
perpendicularity which results in 90° peak displacements in the θ-space of our procedure.

4.3.2.2  GAFB data set


Figure 28a shows a plot of smoothed strength in θ-space. There are 8
strong peaks, one for each map edge element, but only the peaks at 90° and 315°
draw support from at least half of the image edge elements. A peak is observed
for each map edge element because of heavy image activity in the direction of
277°. There are 45 image edge elements with direction 277° ± 3°. Under a 90°
rotation these will align with map edge #8 which has direction 4°. The correct
90° interpretation is also supported by 28 other image elements aligning with one
of the other 7 map edge elements for a total support of 73 in θ-space. However
the alignment of the 45 image elements of direction 277° with map element #6
(direction 230°) supports the interpretation θ=315° with strength 45. Random
support from other alignments brings the support for θ=315° to 63, just above the
threshold. While the 2nd stage of clustering removes the contention from θ=315°,
it is desirable to reduce ambiguity in the θ-space. This can be done by construction
of a more comprehensive map for GAFB so that most of the image edge elements
would support the correct rotational interpretation and hence enhance its strength
relative to alternatives. Figure 28b shows clustering in (xs,ys)-space under the
hypothesis that θ=90°. Figure 28c shows a scale refinement of part of Figure 28b
which enabled a choice of cluster center (xs=2,ys=3) and an evaluation of support
for (θ=90°,xs=2,ys=3) to be 30.15. No cluster for θ=315° exhibited any strength
[...] (θ=90°,xs=2,ys=3) transformation.


4.3.2.3  UNB data set


Table 2 shows that our procedure uncovered a very good approximate registration
transformation as the strongest interpretation, exceeding all others by at
least a factor of 2. That transformation, (θ=330°,xs=-142,ys=8), aligned 19 of 200
image edge elements with 12 of 22 from the map. The strongest contender, (θ=237°,
xs=453,ys=-433), aligned only 6 image edge elements with 2 from the map. We believe
that construction of a more comprehensive map would have enhanced these results just
as it would have in the GAFB case.

4.3.3  Discussion  of  the  image  registration  experiments 

The  results  indicate  that  the  registration  concept  introduced  in  Section  4.2 
can  be  used  to  register  image  edges  to  model  edges.  Because  global  interpretations 
are  formed  by  integration,  supporting  local  evidence  can  be  incomplete  or  in  error 
and the correct interpretation can still be obtained. Moreover, the process of
accumulating  support  is,  unlike  many  synthetic  techniques,  independent  of  the  order 
in  which  local  evidence  is  considered.  In  the  experiments  the  correct  interpretation 
always  dominated  incorrect  interpretations  in  strength  of  support. 

Unlike other registration techniques the new technique need make no assumption
that an approximation to the transformation parameters be known a priori. Thus this
procedure is a candidate front-end procedure to more sophisticated non-linear procedures
which require a good start in their hill-climbing. More general transformations
can be handled in a similar manner. However, it must be pointed out that clustering
techniques, such as the Hough transform, lose efficiency when the size of the
parameter space increases. The simplicity of the new technique makes it quite
attractive for rotations and translations. The complete GAFB image was represented
by 124 straight edge elements, or 124x4 integers, while the GAFB map was
specified by only 8 edge elements.

The results reported in Table 2 were obtained with automatic peak detection
in the $\theta$-space and interactive cluster detection in the (xs,ys)-space. The
(xs,ys)-space clustering was then completely automated and similar results were
obtained. The ambiguities arising in images with high culture content make the
hierarchical clustering procedure suspect. The ambiguity present in Figure 28a
is  rather  alarming.  Combination  of  edge  elements  (i.e.  to  get  Figure  1 *>  from 
Figure 14) improved this situation considerably. However, further experiments in
object detection showed that many of the highest peaks in $\theta$-space could be false
and that 3-D clustering is really necessary. The next section discusses the more
robust clustering technique in the framework of object detection.

4.4  Object  detection  experiments 


Section  4.2  introduced  a new  concept  for  registration  which  used  clustering  to 
form  a globally  valid  registration  transformation  from  local  evidence  of  matching 
structures.  Experiments  in  registering  images  to  maps  were  discussed  in  Section  4.3. 
Although  object  detection  is  the  primary  concern  of  the  present  section  it  should  be 
remembered  that  techniques  applicable  here  will  also  apply  to  general  registration 
problems.

Three types of structures have been used in L.N.K. research as primitive
structures for registration. They are all derivable from the output of edge
extraction. Extraction of primitives was discussed in Section 2. Since the
registration procedure is quite tolerant of errors, the feature extraction
procedures themselves need not be highly refined. The three structures used for
registration are as follows.

(E1) Straight edge elements

Straight edge elements indicate a straight side of an object
and are represented by endpoints A and B as in Figure 29.
Traversal of the edge from A to B keeps the darker halfplane
to the right. Model edges are constructed by a human or
interactively and are assumed to have accurate endpoints. Image
edges are obtained automatically and endpoints are assumed to be
inaccurate.

(E2)  Points  of  sharp  curvature 

Points of high curvature on curved edges are typed as convex
or concave according to whether the inside of the curve
is darker or lighter than the outside. The type is easily
coded as a sign on the curvature as the curve is traversed
with the darker region to the right.

(E3)  Points  of  angular  intersection 

Two straight edges intersecting at a point forming a
blunt angle create an intersection point which may be typed
according to the angular size, i.e. 60°-120°, 90°, etc.
Intersection points are easily obtained by considering nearby pairs
of detected line segments as in E1.

4.4.1 Determining transformation from only straight edge correspondences


Let points A and B define an image edge structure and let points C and D define
a map edge structure. Let $(\theta_i, r_i)$ and $(\theta_m, r_m)$ respectively be the polar
form of the halfplanes determined by edge AB in the image coordinate system and
edge CD in the model coordinate system. We compute $\alpha$, or a set of $\alpha$, so that
edge AB maps onto edge CD under transformation $T_\alpha$. Only rotation and translation
will be allowed here, so that $\alpha = (\theta, xs, ys)$. Given $(\theta_i, r_i)$ and
$(\theta_m, r_m)$, $\theta$ is easily determined as $\theta_m - \theta_i$. Once edge AB is rotated
by $\theta$ to get edge A'B', the configuration of Figure 29 is obtained. The
translational part of $\alpha$ is then constrained by the equation

$$\Delta x \cos\theta_m + \Delta y \sin\theta_m + (r_i - r_m) = 0.$$

Actually, since A'B' is a finite line segment, $\Delta x$ and $\Delta y$ are further
constrained because point A' should translate no further than point C and point B'
should translate no further than point D. (Some tolerance can be given since the
edge detector can overshoot corners.)


The result is that the correspondence of image edge AB and model edge CD yields
a segment in $\alpha$-space between points $(\theta, xs_1, ys_1)$ and $(\theta, xs_2, ys_2)$.
If all image edges are paired with all model edges, $\alpha$-space will be sparsely
filled with line segments from incorrect correspondences but will contain a cluster
of intersecting line segments for the set of correct edge correspondences.
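
The construction above can be sketched in a few lines of Python. This is a minimal
sketch, not the L.N.K. implementation: the function name, the endpoint-tuple
representation, and the choice of segment ends (the translation that brings A' onto C
and the one that brings B' onto D, with no overshoot tolerance) are assumptions made
for illustration. The rotation is computed from the edge directions, which gives the
same angle as $\theta_m - \theta_i$ when both edges follow the darker-side-to-the-right
convention.

```python
import math

def edge_pair_to_alpha_segment(A, B, C, D):
    """Map image edge AB and model edge CD to a segment in alpha-space.

    Returns (theta, (xs1, ys1), (xs2, ys2)); theta is in radians.
    Hypothetical helper: names and conventions are illustrative only.
    """
    # Rotation that makes AB parallel to CD (difference of edge directions).
    theta = math.atan2(D[1] - C[1], D[0] - C[0]) - math.atan2(B[1] - A[1], B[0] - A[0])
    theta %= 2.0 * math.pi
    c, s = math.cos(theta), math.sin(theta)

    def rotate(p):
        # Rotate a point about the origin by theta.
        return (c * p[0] - s * p[1], s * p[0] + c * p[1])

    A2, B2 = rotate(A), rotate(B)
    # One end of the segment puts A' on C, the other puts B' on D.
    first = (C[0] - A2[0], C[1] - A2[1])
    second = (D[0] - B2[0], D[1] - B2[1])
    return theta, first, second
```

Every translation between the two returned ends slides A'B' along the line through
CD, so each satisfies the linear constraint on $\Delta x$ and $\Delta y$ given in this
section.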


Figure 30 shows a set of model edges defining an airplane. These were obtained
by human measurement of a blowup of a grey-scale picture of an airfield. Figure 31
shows a set of image edge elements extracted from a 512x512 area of the airfield.
Figure 32 is a plot of the line segments in that slice of $\alpha$-space with
$320^\circ \le \theta \le 340^\circ$. The rest of $\alpha$-space is quite barren. From the
data shown in Figure 32, transformation parameters $\alpha$ = ($\theta$=330°, xs=-104, ys=42)
were obtained by a simple binning implementation of clustering. Under $T_\alpha$ only one
edge of the 10 model edges (the back of the tail) was not matched to image edge data.
Further experiments of this kind are detailed in Section 4.4.3.

4.4.2  Determining  transformation  from  correspondences  of  abstract  edges 

As Figure 32 shows, $\alpha$-space can be quite noisy due to incorrect
correspondences and to too little constraint on xs, ys. Both noise sources can be
significantly reduced if the lengths of corresponding edge structures are forced to
agree. Thus the endpoints of the edge structures must be reliably delimited. To
achieve this, point structures can be detected and abstract edges formed by spanning
pairs of points. These edges, assumed now to have accurate length, can be used for
registration exactly as discussed in Section 4.2.

A specific case is treated here but the general concept should be obvious
from this example. Abstract straight edges can be formed by using an intersection
point as the vector tail and a high curvature point as the vector head. Figure 33
shows the model of the airplane under this new scheme. There are 6 intersection
points (4 wing-fuselage intersections and 2 tail-fuselage intersections) and 5 points
of high curvature (2 wing tips, 2 tail tips, and the nose), yielding 30 abstract edges.
Compare Figures 33 and 30.
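
As a small illustration of how the 30 abstract edges arise, the sketch below pairs
every intersection point (tail) with every high curvature point (head). The function
name and the coordinate-tuple representation are assumptions for illustration; the
report does not give its data structures.

```python
from itertools import product

def abstract_edges(intersections, curvature_points):
    """Directed abstract edges: tail at an intersection point,
    head at a high curvature point (hypothetical helper)."""
    return [(tail, head) for tail, head in product(intersections, curvature_points)]
```

With 6 intersection points and 5 high curvature points this returns 6 x 5 = 30
directed edges, matching the count given for the airplane model.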


The straight edge content of a second 512x512 area of the airfield image
is shown in Figure 34. Points of intersection were obtained from this data
and combined with high curvature points extracted by another algorithm to get
the set of abstract image edges plotted in Figure 35. Note the confusion
created by the presence of incomplete structures from 2 airplanes. The clustering
succeeded, nevertheless, in detecting a transformation $T_\alpha$ that mapped 31
abstract image edges onto 16 abstract model edges. Figure 36 gives a plot
of only those edge elements from Figure 35 that matched edge elements of the
model. Note that the 31 abstract edge elements represent only 16 true edges.

There is a duplication of abstract edges because several edge detections along
straight edges caused duplication of intersection points, as is evident in Figure
35. Merging copies of the same intersection would have been an improvement but
was felt to be unnecessary.

4.4.3  Experiments  in  object  detection 

The procedures illustrated in Sections 4.4.1 and 4.4.2 were tested on several
windows of the AFB image (Figure 11). Since the airplanes were known to be of
rough size 256x256 pixels, testing a set of overlapping 512x512 pixel windows was
guaranteed to consider all relevant areas. Seven windows were required to cover the
entire plane parking area: two of these windows in fact contained no planes and
two windows contained all or part of two planes. The nose and wing tips of each
plane were identified by a human and a "correct" reference transformation was
determined. Actually the transformation was computed in 3 ways so that a measurement
error could be established. Table 3 contains a summary of the detection
experiments. The centers of the windows are listed in Column 1 while the human computed
registration transformation, relative to the window centers, is given in Column 2.
For example, if all points of the window centered at (256,1536) were shifted by
x'=x-256, y'=y-1536, then a transformation of ($\theta$=330°, xs=-55, ys=69) would align
airplane #4 with the model in Figure 30.
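
The window layout can be sketched as follows. This is an illustrative sketch only:
the function names are hypothetical, and it assumes an object that fits in an s x s
square, windows of side 2s stepped by s (so neighbours overlap by s), and image
dimensions that are multiples of s.

```python
def window_centers(width, height, s):
    """Centers of 2s x 2s windows stepped by s in each direction."""
    xs = range(s, max(width - s, s) + 1, s)
    ys = range(s, max(height - s, s) + 1, s)
    return [(cx, cy) for cy in ys for cx in xs]

def to_window_coords(point, center):
    """Shift an image point into window-centered coordinates (x' = x - cx, y' = y - cy)."""
    return (point[0] - center[0], point[1] - center[1])
```

For s = 256 this yields 512x512 windows; a point in the window centered at
(256,1536) is shifted by x'=x-256, y'=y-1536, as in the example above.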


Results of automatic object detection via registration are given in Columns 3
and 4 of Table 3. The parameters of possible registration transformations are
given at the left of Columns 3 and 4 and the quality of match achieved is evaluated
at the right of Columns 3 and 4. For example, the best possible detection of
airplane #4 in sub-window 6 using straight edge evidence occurs with transformation
($\theta$=332°, xs=-59, ys=66). Under this transformation 18 out of 70 edge elements from
the image map onto 9 out of 10 edge elements in the model with an average match
value of 63.30 (out of a possible 100). This heuristic average match value is in
fact better than that achieved with the human determined transformation, which is
evaluated in the starred row of Columns 3 and 4. Using abstract edge evidence the
strongest cluster in $\alpha$-space was detected at ($\theta$=332°, xs=-61, ys=61). This
transformation aligned 52 of 155 abstract image edges with 10 of 30 abstract model edges.
The average of the 52 match values was 29.06; not great but a bit higher than the
average match for the human computed transformation.


A binary  decision  PLANE  versus  NO  PLANE  was  not  made.  This  is  consistent  with 
our  philosophy  that  other  information  should  be  integrated  into  the  final  decision. 
Search  for  unmatched  model  edges  to  verify  a possible  detection  is  discussed  in 
Section  5 of  this  report.  It  can  be  seen  from  Table  3,  however,  that  no  detection 
will  be  falsely  dismissed  with  the  current  stage  of  processing.  There  are  some  false 
alarms  appearing  at  this  stage  — their  removal  will  be  treated  in  Section  5. 


It is interesting to note the ambiguities revealed in Table 3. For instance,
in subwindow 2 a fairly strong detection was made with $\theta$=145°. This is 180° off of
the correct rotation and shows that the airplane does have some rotational symmetry.
While fewer edge elements align with the 145° rotation, the average match value of
the aligning edges is higher than for the correct transformation. Registration of
subwindows 3, 4 and 5 shows that some 90° and 270° rotational symmetry also exists.
The most valuable information for removing the ambiguity lies in the edges of the
tail, which apparently are the most difficult to extract bottom-up. Top-down
testing for previously undetected tail edges is discussed in Section 5.

4.5  Discussion  of  registration  results 

We  interpret  the  results  to  be  highly  promising  and  to  have  demonstrated  the 
features  of  the  new  registration  procedure. 

. The  procedure  is  fairly  efficient  since  it  uses  only  cheaply 
extracted  primitive  features  of  imagery.  Recall  that  only  124x4 
integers  were  used  to  represent  the  GAFB  image  and  only  8x4  integers 
represented  the  GAFB  map. 

. The  procedure  can  operate  with  fair  amounts  of  missing  information 
or  irrelevant  background.  See  Figures  30  to  32  for  example. 

. The procedure does not require a starting approximation to the
transformation sought. In our experiments we have assumed only that
the rotation $\theta$ was confined to (0°,360°) and that xs and ys were within
half the window size in either direction. In particular such freedom
enables the procedure to be used for detection of moveable objects:
objects in aerial imagery or parts on an assembly line.

The cost of the new procedure should be considered in more detail. The chief
cost component is the extraction of the image structures, i.e. the edge elements
or points. Fortunately, hardware can be used for extraction of straight edges
[Stockman 1977]. If there are i edge elements from the image and m from the map
then $i \cdot m$ pairs are formed for all possible structure correspondences. If the edge
element length is reliable, almost all of the $i \cdot m$ pairs are rejected immediately
while the remaining few are used to define points in $\alpha$-space. Clustering is
done on $f \cdot i \cdot m$ line segments where $0 < f \le 1$. The clustering procedure uses
10x10x10 bins, so the cost of smoothing and peak detection is a constant $c_1$ while the
cost of filling the bins varies linearly with the number of possible edge
correspondences, $c_2 f i m$. Clustering is typically repeated 3 times until the bin scale
is comparable to the accuracy obtainable. Thus the total cost is $3(c_1 + c_2 f i m)$.
Since m can be controlled to be a small fixed number, the total cost can practically be
conceived as linear in i, the number of image edge elements. Since not all of the edge
elements need be used to establish registration, some scattered sampling of them could
be used and thus i could be controlled as well.
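
The repeated 10x10x10 binning can be sketched as below. This is a simplified sketch
under stated assumptions, not the procedure used in the experiments: each
correspondence is taken to contribute a single point ($\theta$, xs, ys) rather than a
line segment, no smoothing is applied, rotation wrap-around at 360° is ignored, and
each round zooms onto the peak bin padded by one bin on every side. The function name
and signature are hypothetical.

```python
def cluster_peak(points, bounds, bins=10, rounds=3):
    """Coarse-to-fine binning in a 3-D parameter space.

    points: iterable of (theta, xs, ys); bounds: [(lo, hi)] per axis.
    Returns (final_bounds, count_in_last_peak_bin).
    """
    best = 0
    for _ in range(rounds):
        counts = {}
        for p in points:
            idx = []
            for v, (lo, hi) in zip(p, bounds):
                if not lo <= v < hi:
                    break                      # point lies outside the current box
                idx.append(min(int((v - lo) / (hi - lo) * bins), bins - 1))
            else:
                key = tuple(idx)
                counts[key] = counts.get(key, 0) + 1   # cost linear in number of points
        if not counts:
            return bounds, 0
        peak = max(counts, key=counts.get)     # peak detection over a fixed number of bins
        best = counts[peak]
        # Zoom: keep the peak bin plus one neighbouring bin on each side.
        new_bounds = []
        for k, (lo, hi) in zip(peak, bounds):
            w = (hi - lo) / bins
            new_bounds.append((lo + (k - 1) * w, lo + (k + 2) * w))
        bounds = new_bounds
    return bounds, best
```

Filling the bins touches each point once per round, while the peak search covers a
fixed number of bins, which mirrors the $3(c_1 + c_2 f i m)$ cost form given above.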

Registration of large areas is aided by the presence of map features over wide
regions of the map. Object detection on the other hand requires separate
consideration of locally confined sets of edges. If the object fits in a square of
side s we examine all 2sx2s areas with endlap and sidelap of s. This guarantees that
every object will be enclosed in at least one area. As Figure 34 shows, more than one
object can be in one area. Because many registrations have to be done to detect
small objects, global registration of the general area with a base map is
worthwhile in order to delimit what specific areas are to be examined for which specific
objects. Planes are sought on airstrips, and ships are sought in water, for instance.

More work needs to be done in testing the use of abstract edges in registration
of varied terrain, perhaps with little cultural activity, to archived maps. A pilot
experiment was done toward this end. A sheet of graph paper was overlaid on the
feature map of Harrisburg depicted in Figure 17 and prominent intersection and
curvature points were identified by researcher #1. The process was then repeated
by researcher #2 using a different placement of the graph paper, i.e. a different
coordinate system. Not only were the resulting point features in different coordinate
systems, but there were different sets of points identified by the two researchers.
This situation compares to what would likely result from automatic detection in
repeated coverage. Despite some differences in the selected structures there was
enough in common to produce a large cluster in parameter space to identify the
correct transformation between coordinate systems. Hopefully, this effect can be
repeated in future experiments matching aerial imagery to maps.

Further  work  needs  to  be  done  with  transformations  of  more  than  3 parameters. 

For  aerial  imagery  scale  differences  between  image  and  map  should  certainly  be 
handled.  For  images  providing  perspectives  on  a 3-D  world  up  to  6 parameters  might 
be  required.  Use  of  4-D  clustering  for  a full  RS&T  transformation  does  not  seem  to 
present  a much  greater  problem  than  the  3-D  case,  particularly  if  a hierarchical 
approach  can  be  used.  However,  the  simple  local  structural  correspondences  currently 
being used are insufficient for determination of 6 parameters with subsequent
placement of mass in a 6-D $\alpha$-space.

Figure 26. GAFB edge element map made by human on print plotted picture.

(b) GAFB: High level clustering in (xs,ys)-space under the hypothesis that $\theta$=90°.

SUMMARY OF REGISTRATION RESULTS ON THREE EXPERIMENTAL DATA SETS
(* indicates no viable alternative cluster)

Image edge AB or $(\theta_i, r_i)$ is rotated by $\theta_m - \theta_i$ into A'B' or $(\theta_m, r_i)$
to be parallel to map edge CD or $(\theta_m, r_m)$. The unit vector from A' to R is
$(-\cos\theta_m, -\sin\theta_m)$; the projection of A'C onto A'R has constant length,
yielding the equation relating $\Delta x$ and $\Delta y$ of the translation:

$$\overrightarrow{A'C} \cdot (-\cos\theta_m, -\sin\theta_m) = r_i - r_m$$
$$(\Delta x, \Delta y) \cdot (-\cos\theta_m, -\sin\theta_m) = r_i - r_m$$
$$\Delta x \cos\theta_m + \Delta y \sin\theta_m + (r_i - r_m) = 0$$

Figure 29. Derivation of $\alpha = (\theta, \Delta x, \Delta y)$ mapping image edge element
$(\theta_i, r_i)$ onto model edge element $(\theta_m, r_m)$.

Figure 30. Model of airplane in terms of directed straight edge segments.

Figure 36. Edge elements of Figure 35 which match edge elements of Figure 34 under
automatically derived registration transformation $T_\alpha$ ($\theta$=330°, xs=-134, ys=-18).

TABLE [body not recovered]. Footnotes: value of avg. match weight using the "true"
transformation; no points of high curvature were found in this area, therefore no
abstract edges could be formed and no registration took place.

5.  Verification  of  structures  in  Imagery 


In this section it is assumed that enough information has been extracted
from an image so that hypotheses about the remaining image content can be made.
Our interest is restricted to geometric structures in this report. In the
verification of the existence of a particular geometric structure, the rough location
and orientation of the structure is known. That is, if the structure exists at
all, the model being used should predict approximately where the structure is
with respect to previously detected structures, how big it is, what its shape is,
etc. For example, if structures resembling the two wings of an airplane have
been detected there are at most two places to search for the tail. Finding a
hypothesized structure greatly increases the confidence in the model that
generated the hypothesis, while failure to detect the predicted structure has just the
reverse effect.

Because verification is done with model prediction, focused searches can be
performed. Not only is the area of imagery to be searched well-confined but there
are also tight constraints on shape and orientation. Thus faint or hard to detect
features can be found more reliably and more efficiently in the top-down mode than
in the bottom-up mode.

Three techniques are described for verification (top-down detection) of
geometric structure in imagery. All three techniques operate on edge or gradient
information to detect boundary segments of a certain shape. In Section 5.1 "certain
shape" is defined by a functional model and the verification paradigm is curve-fitting
with $\chi^2$ evaluation of the results. Hough detection of parabolic or circular
arcs is discussed in Section 5.2. This approach is essentially template
matching where each template is defined by a set of parameters. A practical approach
for verifying a fixed but arbitrary curve structure is given in Section 5.3.
There a boundary feature is viewed as a set of high gradient points that must
be found in the image. This is also a template-matching approach but without
parameterization.

5.1 Verification of generic shapes via curve-fitting

The following technique was designed and implemented in 1-D for detection
of shape features in waveforms. The system implemented was called WAPSYS, short
for waveform parsing system [Stockman 1977]. The treatment will thus be formally
given along those lines. A boundary curve in a 2-D image is basically a 1-D entity
and so the 1-D method can be converted to the 2-D case. Methods for conversion
and the problems encountered are discussed at the end of this section.


5.1.1  Extracting  primitives  by  curve-fitting 


We define a waveform as a finite function $W = \{(x_j, y_j)\},\ j = 1, \ldots, N$; i.e.
there are a finite number N of pairs $(x_j, y_j)$ and no two pairs have the same first
element. Since x is a time or distance variable, there is a linear order imposed on
the set of points, i.e. $x_j$ is before $x_{j+1}$ and after $x_{j-1}$. The task is to verify
if some subset (subinterval) of points $\{(x_j = a, y_j), \ldots, (x_k = b, y_k)\}$ has shape
M. M is a shape primitive or morph. The set of points on which the search for M is
constrained is called the constraint interval and is indicated as interval $[\ell, r]$.
The subinterval on which morph M is detected, if it is detected, is called the
match interval. The detector may, in fact, identify no occurrence, one occurrence,
or many occurrences of the morph $\langle M, [a_j, b_j], e_j, P_j \rangle$ existing on the
constraint interval. $P_j$ is the parameterization of the morph M and $e_j$ is an
evaluation of its merit or certainty. The morph M is specified to the detector by
a syntactic name and semantic constraints C which must be satisfied by
parameterization P. M might be formally defined as a functional form
$y_m = y_m(x) = f(x)$ to be fit to the data $y(x)$, $x \in [a,b]$, under set of constraints C.


For example, the 'cap' morph of Figure 37 could be defined as

$$y_m(x) = p_2 x^2 + p_1 x + p_0$$

subject to the constraints that $y_m(a) = y_m(b)$, $c_1 < p_2 < c_2 < 0$ and
$c_3 \le b - a \le c_4$. The parameterization $P = \{p_0, p_1, p_2\}$ could naturally be
determined by least squares fitting $y_m(x)$ to $y(x)$ over $x \in [a,b]$. Figure 37 shows
5 of the morphs used extensively in a pulse wave application detailed in
[Stockman 1977]. All 5 morph definitions imply least squares error estimation of 2
free parameters. Under the assumption of Gaussian noise distributed as
$N(0, \sigma^2)$ the variable

$$s = \Big( \sum_{x=a}^{b} (y_m(x) - y(x))^2 \Big) / \sigma^2$$

is $\chi^2$ distributed with $b - a - 1$ degrees of freedom if the values of $y(x)$ are
interpreted as realizations of $y_m(x)$ plus noise. By defining e as the probability
that $\chi^2(z, b-a-1) > s$, the merit or certainty of the fit (or morph hypothesis) is
nicely bounded in [0,1] and may indeed be interpreted as a probability.

Figure 38 shows some pulse wave data that was fitted with models from Figure
37. The morphs UP, MPS, MNC, LN, and HOR are defined and detected by using constraints
on the parameters of the straight line model. The CAP morph and RSH morph are instances
of the cap and right shoulder of Figure 37. Constraining the juxtaposition of these
morphs is in the domain of syntax, which was discussed in Section 3.
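
The cap fit and its $\chi^2$ evaluation can be sketched as follows. This is a minimal
sketch under stated assumptions, not the WAPSYS code: the equal-endpoint constraint
$y_m(a) = y_m(b)$ is enforced by substituting $p_1 = -p_2(a+b)$, which leaves the 2
free parameters $p_2$ and $p_0$; the noise variance is supplied by the caller; and the
survival function is computed from the standard incomplete-gamma recurrence for
integer degrees of freedom. The names `chi2_sf` and `fit_cap` are hypothetical, and
the parameter bounds $c_1 \ldots c_4$ are not checked here.

```python
import math

def chi2_sf(s, dof):
    """P(chi-square variate with integer `dof` >= 1 exceeds s)."""
    x = s / 2.0
    if dof % 2:                      # odd dof: start from Q(1/2, x) = erfc(sqrt(x))
        a, q = 0.5, math.erfc(math.sqrt(x))
    else:                            # even dof: start from Q(1, x) = exp(-x)
        a, q = 1.0, math.exp(-x)
    while a < dof / 2.0:             # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q

def fit_cap(xs, ys, sigma2):
    """Least squares fit of y = p2*x^2 + p1*x + p0 with y(a) = y(b).

    Returns ((p0, p1, p2), sse, e) where e is the chi-square merit in [0, 1].
    """
    a, b = xs[0], xs[-1]
    u = [x * x - (a + b) * x for x in xs]        # model becomes y = p2*u + p0
    n = len(xs)
    mu, my = sum(u) / n, sum(ys) / n
    suu = sum((ui - mu) ** 2 for ui in u)
    suy = sum((ui - mu) * (yi - my) for ui, yi in zip(u, ys))
    p2 = suy / suu
    p0 = my - p2 * mu
    p1 = -p2 * (a + b)
    sse = sum((p2 * x * x + p1 * x + p0 - y) ** 2 for x, y in zip(xs, ys))
    e = chi2_sf(sse / sigma2, n - 2)             # n - 2 = b - a - 1 for unit spacing
    return (p0, p1, p2), sse, e
```

A fit whose residual is small relative to the assumed noise gives e near 1; a fit
with residual far above the noise level gives e near 0.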

5.1.2 A curve-fitting detector

To find the optimum fit(s) of a given morph on some subinterval [a,b] of
constraint interval $[\ell, r]$, a large number of fits might have to be tried. If the
bounds $c_3$ and $c_4$ on $b-a$ are stringent and the interval $[\ell, r]$ is not big, the
entire interval $[\ell, r]$ can be scanned for all subintervals [a,b] satisfying
$c_3 \le b - a \le c_4$. Such exhaustive scanning can be enforced by WAPSYS and guarantees
that the optimal fit under the given constraints is found. If the morph width $b-a$
can vary widely or if the interval $[\ell, r]$ is big, optimality can be sacrificed for
efficiency through a heuristic search alternative. Details of the heuristic search
alternative can be found in [Stockman 1977] or [Stockman 1978] and are ignored here
because it is assumed that for verification the search interval $[\ell, r]$ will be well
enough constrained for exhaustive search. Exhaustive search is faster under strong
constraints, partly because recursively updateable curve fitting can be done.
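
One way to read "recursively updateable" is that, for a fixed left end a, the sums
needed for a least squares fit are extended by one point as b grows, so each new
subinterval costs constant time. The sketch below does this for the straight line
model only. It is an illustration under stated assumptions, not the FITTER routine:
the function name is hypothetical, x is taken to be the 0-based sample index, and
the constraint tests on slope or curvature are left out.

```python
def exhaustive_line_fits(y, lo, hi, c3, c4):
    """Fit a line on every [a, b] within [lo, hi] with c3 <= b - a <= c4.

    Returns a list of (a, b, slope, intercept, sse).
    """
    results = []
    for a in range(lo, hi + 1):
        n = sx = sy = sxx = sxy = syy = 0.0
        for b in range(a, min(hi, a + c4) + 1):
            # Recursive update: add sample b to the running sums in O(1).
            n += 1
            sx += b
            sy += y[b]
            sxx += b * b
            sxy += b * y[b]
            syy += y[b] * y[b]
            if b - a < max(c3, 1):
                continue                      # too short to satisfy the width bound
            den = n * sxx - sx * sx
            slope = (n * sxy - sx * sy) / den
            icpt = (sy - slope * sx) / n
            sse = syy - icpt * sy - slope * sxy   # residual sum of squares from the sums
            results.append((a, b, slope, icpt, max(sse, 0.0)))
    return results
```

Each returned sum of squared errors could then be scored against the noise level
with the chi-square comparison of Section 5.1.1.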

With exhaustive search a fair number of fits may be made and several of them
may yield high $\chi^2$ values. All fits (detections) above a certain threshold are saved
in a ranked list while those below threshold are discarded. The list of qualifying
detections is passed back to the syntax/semantics module for further analysis. Since
the size of a morph is constrained over a flexible range it will often happen that
one detection will overlap another or be entirely contained in another. For this
reason pruning is done on the list of detections so that redundant detections are
discarded. Let each interval of data [a,b] fit by the detector in search of morph
M be called state [a,b]. The conditions for suppressing one detection in the
presence of another are set forth as follows.

Definition  5.1 

Let [a,b] and [c,d] be two intervals of the real line and
let $s([x_1, x_2]) = x_2 - x_1$ be the length measure for interval $[x_1, x_2]$.
The overlap or correlation between intervals [a,b] and [c,d] is
defined as

$$\mathrm{OVRLAP}([a,b],[c,d]) = \frac{2\, s([a,b] \cap [c,d])}{s([a,b]) + s([c,d])}.$$


Definition  5.2 


Let interval $[L_i, R_i]$ be fit with quality $E_i$ and $[L_j, R_j]$
be fit with quality $E_j$ in two states of the curve fitting.
State $[L_i, R_i]$ is said to dominate state $[L_j, R_j]$ if and only if

a) $E_i > E_j$

and b) $[L_j, R_j] \subset [L_i, R_i]$ or
$\mathrm{OVRLAP}([L_i, R_i], [L_j, R_j]) > t$ and
$s([L_j, R_j]) \le s([L_i, R_i])$.

Roughly speaking, fit i dominates fit j if it has at least as good
a quality and covers at least as many points in the same general
region of data.
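
Definitions 5.1 and 5.2 translate directly into a pruning step. The sketch below is
illustrative only: the (L, R, E) tuple layout, the function names, and the default
threshold t = 0.5 are assumptions, since the report does not state the value of t
used.

```python
def ovrlap(p, q):
    """OVRLAP of Definition 5.1 for intervals p = (a, b) and q = (c, d)."""
    inter = max(0.0, min(p[1], q[1]) - max(p[0], q[0]))
    total = (p[1] - p[0]) + (q[1] - q[0])
    return 2.0 * inter / total if total > 0 else 0.0

def dominates(fi, fj, t):
    """True if fit fi = (L, R, E) dominates fit fj per Definition 5.2."""
    (li, ri, ei), (lj, rj, ej) = fi, fj
    if not ei > ej:                                   # condition a)
        return False
    contained = li <= lj and rj <= ri                 # [Lj, Rj] inside [Li, Ri]
    similar = ovrlap((li, ri), (lj, rj)) > t and (rj - lj) <= (ri - li)
    return contained or similar                       # condition b)

def prune(fits, t=0.5):
    """Discard every fit that is dominated by some other fit in the list."""
    return [f for f in fits
            if not any(dominates(g, f, t) for g in fits if g is not f)]
```

Detections in different regions of the data never suppress one another, because
neither containment nor a large overlap holds between them.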


The 5 morphs described in Figure 37 were implemented as FORTRAN subroutines for use
in several domains. The subroutines CAPFIT, LINFIT, RSHFIT, LSHFIT compute the fit
parameters for a given subinterval of data and also return the sum of the squared
errors. A driver program (FITTER) controls the choice of subintervals to be fit
and interprets the quality of fits by calling routine QCHISQ to make the chi-squared
comparison of fit error to noise. The set of fit constraints $C = (c_1, c_2, c_3, c_4)$ is
obtained by FITTER via a call to GETCST, which gets them from tabularized input from
the user. FITTER returns when the search is complete. An entry point NXTFIT is
then available to get successively worse fits from the list of FITTER detections.


$y_m(x) = p_2 x^2 + p_1 x + p_0$

(c) right shoulder    (d) left shoulder    (e) straight line

Figure 37. Five morphs defined by constrained fits of model $y_m(x)$ to
data $y(x)$, $x \in [a,b] \subseteq [\ell, r]$.

Carotid pulse wave sample segmented into primitive shape features.


5.1.3  Training  and  testing  curve  fitting  detectors 


While general morphs such as the 'cap' or the 'line' might be applicable
in several different problem domains, it is unlikely that the constraints would
be defined in the same way in each domain. These constraints could be defined
from a priori considerations or could be "learned" from data samples. WAPSYS
can be used to define morph constraints by fitting training data under no
constraints and recording the parameterizations that result. The set of
parameterizations for each named morph can then be converted to constraints for use in
automatic analysis.


WAPSYS was used to learn the constraints necessary for automatic parsing of
carotid pulse waves. 14 morphs were identified in this application and all were
expressible in terms of constraints on generic features as shown in Figure 37.
Eight hours was spent by the author examining print plots of 20 sample pulse waves.
Aided only by a ruler, all the data was segmented, producing for each sample a list
of triples $\langle M_i, a_i, b_i \rangle$ where $M_i$ was a 3 character morph name and
$[a_i, b_i]$ was the interval of data on which the morph was declared to exist. 459
morphs were identified in this manner for the 20 waveforms. A training routine was
written which called the fitting routine appropriate for each morph specified and
forced it to fit the specified data interval. The fit was forced in the sense that
the noise tolerance was varied upward until either a limit was reached or a fit of
quality above 0.5 was achieved. (The noise limit was useful for detecting human errors
in creation of the training items; 7 or 8 errors were detected in this manner.) The
parameters of a successful fit were contributed to a running statistical summary
which  was  the  final  output  of  training. 

The statistical summary contained, for each morph, the lowest, highest, and mean
value and the standard deviation for each of 3 variables. The variables were
interval width $b-a$, noise $\sigma^2$, and curvature for parabolic morphs or slope
for linear ones. Three iterations of training were required to remove all human
errors in the 459 training items. The summary values from training were then used
to construct a table of constraints to be used in automatic detection by the FITTER
algorithm.

A testing module was written for WAPSYS to drive the FITTER algorithm in an attempt
to automatically identify the same morphs that the human user did in constructing
the list of training items. The testing routine reads each training item
$\langle M_i, a_i, b_i \rangle$, computes an enlarged constraint interval
$[\ell_i, r_i]$ from $[a_i, b_i]$ and calls the FITTER to detect morph $M_i$ on the
interval $[\ell_i, r_i]$. The list of successful fits (if any) is then scanned to
find the fit which correlates best with the training item. A report of the fits
achieved for each training item is given as a final summary of how well the morphs
detected correlated with the training items. The constraint interval for each
training item is computed as

$$\ell_i = \max(1,\; a_i - 2(b_i - a_i + 1)\,z_1)$$

$$r_i = \min(n,\; b_i + 2(b_i - a_i + 1)\,z_2)$$

where $n$ is the total number of points in the waveform and $z_1$ and $z_2$ are
independent samples from the unit normal distribution $N(0,1)$. The correlation
between two fits was given in Definition 1. Since many fits can be detected in
testing, the best correlating fit is used to contribute to the correlation summary.
Results of testing should indicate whether or not the models used for detection can
succeed in a given application. Failure in the training and testing cycle indicates
the need to define a different set of primitives. In the carotid pulse wave
application the results of training and testing were very good. These results and
the results of further analysis of a larger set of waveforms are discussed next.
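
The two bookkeeping steps of the training and testing cycle can be sketched as
follows. This is a minimal illustration, not the WAPSYS code: the class and
function names are hypothetical, the running variance uses Welford's update (a
swapped-in standard technique), the population form of the standard deviation is
assumed, and rounding the interval ends to integers is also an assumption. The
samples `z1` and `z2` are passed in explicitly so that the computation is
repeatable.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RunningSummary:
    """Lowest, highest, mean and standard deviation of one fit variable."""
    count: int = 0
    lowest: float = math.inf
    highest: float = -math.inf
    mean: float = 0.0
    _m2: float = 0.0  # running sum of squared deviations (Welford's method)

    def add(self, value: float) -> None:
        self.count += 1
        self.lowest = min(self.lowest, value)
        self.highest = max(self.highest, value)
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        # Population form assumed; the report does not say which was used.
        return math.sqrt(self._m2 / self.count) if self.count else 0.0


def constraint_interval(a: int, b: int, n: int, z1: float, z2: float) -> Tuple[int, int]:
    """Constraint interval [l, r] computed from the training interval [a, b].

    z1 and z2 are samples from N(0,1) as in the text; a negative sample
    moves that end inward rather than outward.
    """
    width = b - a + 1
    left = max(1, round(a - 2 * width * z1))
    right = min(n, round(b + 2 * width * z2))
    return left, right
```

For example, `constraint_interval(10, 19, 100, 0.5, 0.25)` clips the left end at
the first point of the waveform and extends the right end by half the interval width.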

Figure 38 shows a syntactic labeling of 2 consecutive cycles of the same
pulse wave. Note the competition between the straight line morph HOR and the
parabolic CAP for representation of the data in the interval [110,120]. Note
also that interval [77,103] is represented by a cap and a line while interval
[168,190] is represented by a line and a right shoulder. This is somewhat
alarming: shouldn't the human body be behaving the same way during both these
intervals?

Table 5 supports the claim that higher level knowledge can be used to make
lower level detection more efficient. In comparing total analysis (detection +
syntax + semantics) of 158 waveforms with detector testing (no syntax or semantics)
on 20 different training waveforms it is apparent that the interaction of higher
level processes in the detection process can pay for itself in terms of computation
time. The syntactic and semantic modules of WAPSYS were as described in Section 3.

Performance  of  the  detection  module  of  WAPSYS  was  excellent  on  the  20  training 
samples.  Performance  degraded  on  the  test  set:  some  morphs  on  some  of  the  waveforms 
were badly fit and some of the waveforms were not parsed. In many of these cases
failure  was  the  desired  outcome.  Apparently  our  linguistic  model  for  one  region  of 
the  data  was  inadequate  and  no  easy  fixes  are  evident. 

Due to infrequent failures in the modeling and computer times longer than real
time, a system such as WAPSYS may be inadequate for monitoring of sensors but may be
quite good for research or other interactive off-line operation. Graphics such as
those in Figure 38 are easily available to the user and the analysis is readily
communicated due to the linguistic description obtained. Approval of the automatic
analysis saves the user time. Analyses unacceptable to the user can be easily
adjusted due to the common linguistic model between man and machine.

PERFORMANCE OF FITTER ON 20 PULSE
WAVES TRAINING SAMPLES

number of morphs                           440
total number of fits tried                8771
total number of detections                1576
total number of perfect matches            186
average best match                       0.044
number of morphs detected                  440
number of morphs correctly detected        440
total seconds of computer time*    [illegible]

* Univac 1108 computer used

WAPSYS EXECUTION DATA FOR SEVERAL
ANALYSES OF 60 CYCLES OF PULSE WAVES*

                                          Univac 1108
                                          SUP Time
                    Total    Total        in Seconds      Number of Samples
                    Fits     Detections   Per 60 Cycles   Based On

detector training   44?      44?          12              20 training
detector testing    8800     1600         210             20 training
total analysis*     6800     2600         190             158 testing

*Primitive detection (no syntax or semantics) for 20 training samples (60 cycles)
is compared with total analysis (detection + syntax + semantics) of 158 testing
samples. Execution time is normalized to 60 cycles of data for comparison. The
"cycle" has a constant syntactic character but can vary widely in number of points.

5.1.5 Performance of curve fitting in 2-D


Extraction of primitives via curve fitting in 2-D was briefly discussed in
Section 3.4 with respect to the detection of an airplane wing. A more detailed
example is given here which shows how the 1-D curve fitters of WAPSYS were made
to work on boundary curves from 2-D imagery.

Imagery containing a parabolic boundary curve is shown in Figure 39. The
hypothesis to be verified is that the curve runs from point (58,49) to point (30,49)
of the image. In this case the hypothesis was generated by the author and not by a
PRR model as would be the case with totally automatic processing. The hypothesis
is used to establish a new coordinate system as shown in Figure 39. The image
gradient is then sampled along profiles generated along constant x coordinates in
the new coordinate system. From each profile the highest above-threshold gradient
magnitude is selected as the best point on the possible boundary curve and the
entire point set $W = \{(x_i, y_i)\},\ i = 1,\dots,N$ is passed to the curve-fitter.
Figure 40 shows the points selected on the gradient profiles from the image of
Figure 39. Since only one point $(x_i, y_i)$ is selected on the $i$-th profile,
$W$ is itself a function, as is a waveform. Exhaustive search of $W$ for a
parabolic CAP (Figure 37) is shown in Figure 41. All 45 possible intervals of 20 or
more points are tried. Most fits $[a,b]$ are failures (value 0), even though the
points do lie along a parabola, because of the constraint that
$Y(a) \approx Y(b)$. A good fit is achieved on [3,22] and is later suppressed by a
dominating fit on [1,27]. The value of that fit is 1.00 via comparison with a given
noise allowance. The fitted curve runs from point (58,54) to point (32,54) in the
image and correlates 0.544 with the original hypothesis vector
(58,49) $\rightarrow$ (30,49).
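
The point selection and the exhaustive interval search can be sketched as below.
The function names are hypothetical and the profiles are assumed to be non-empty
sequences of gradient magnitudes. The count of 45 intervals quoted above is
reproduced when $W$ holds 28 points; that value of $N$ is an inference from the
count, not a number stated in the text.

```python
from typing import List, Optional, Sequence, Tuple


def select_peaks(profiles: Sequence[Sequence[float]], threshold: float) -> List[Optional[int]]:
    """For each (non-empty) gradient profile, return the index of its largest
    magnitude, or None when that magnitude does not exceed the threshold."""
    picks: List[Optional[int]] = []
    for profile in profiles:
        best = max(range(len(profile)), key=lambda k: profile[k])
        picks.append(best if profile[best] > threshold else None)
    return picks


def candidate_intervals(n_points: int, min_points: int) -> List[Tuple[int, int]]:
    """All 1-based closed intervals [a, b] holding at least min_points points."""
    return [
        (a, b)
        for a in range(1, n_points + 1)
        for b in range(a + min_points - 1, n_points + 1)
    ]
```

Because exactly one point is kept per profile, a locally strongest but globally
inconsistent point cannot be replaced later by the curve fitter.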

Figure 39. Subwindow of GAFB image containing a parabolic curve
(Hypothesis is that curve extends from point (58,49)
to point (30,49) and has negative curvature.)

Figure 40. Profiles of the gradient sampled with respect
to the xy-coordinate axes labeled in Figure 39.
(Points of peak gradient are input to curve fitter.)

Figure 41.

A PRR model was actually used to generate hypotheses for CAP detection in
looking for airplane wings as discussed in Section 3.4. That experiment exposed
a major flaw in the point sampling technique. Because only one point was selected
on a profile, the globally best set of points was usually missed. This is evident
in Figure 40 where a smoother set of points can be chosen by man once the global
structure is seen. In the airplane wing this phenomenon was particularly acute
where the parabolic tip joined the straight side.

Clearly a truly 2-D approach is required but there are more nuisances in
2-D. Cooper [1976] discusses true 2-D detection and isolates 3 problem steps.

P1) Given a fit on n points in the plane, the algorithm must choose
the next point for possible extension of the fitted curve. (This
is not in general easy to do and there is interaction with P3.)

P2) Once the next candidate point is chosen the parameters of the
fit must be updated. (Not too hard; usually efficient recursive
updating is possible for practical models.)

P3) By using the likelihood value for the fit (or its derivative)
it must be decided if encompassing the new point is a good
alternative. (This so-called stopping rule is a major problem
because it involves the interaction of models at a joint.)
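
Steps P2 and P3 can be illustrated for the simplest practical model, a
least-squares straight line kept as running sums. This is a sketch under stated
assumptions and not Cooper's method: the class name is hypothetical, the model is
restricted to $y$ as a function of $x$, and the stopping test uses a fixed residual
tolerance as a stand-in for a likelihood-based rule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class IncrementalLineFit:
    """Least-squares line y = slope * x + intercept kept as running sums."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def add(self, x: float, y: float) -> None:
        # P2: constant-time update when a candidate point is accepted.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def line(self) -> Tuple[float, float]:
        """Return (slope, intercept) of the current least-squares line."""
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return slope, intercept

    def accepts(self, x: float, y: float, tolerance: float) -> bool:
        # P3 stand-in: fixed residual tolerance, not a likelihood test.
        slope, intercept = self.line()
        return abs(y - (slope * x + intercept)) <= tolerance
```

The update costs the same no matter how many points are already in the fit, which
is the sense in which recursive updating is efficient; the difficulty noted under
P3 lies in choosing the acceptance rule where two models meet.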

It may be that the traditional hypothesis testing via curve fitting, which
seems at first to be the natural thing to do, is really too involved and hence
frail compared with other alternatives. Making cruder detections via Hough
techniques is discussed next in Section 5.2, and Section 5.3 presents a highly
practical alternative in cases where the curve shape is highly constrained a priori.


5.2 Hough detection/verification of other curves


Stereotyped curve structures such as segments of ellipses, parabolas,
hyperbolas, and circles can be detected using the Hough transform. Because it
is essentially template matching, Hough detection will be robust when a large
number of points lie on the curve segment. Because an entire set of parameterized
templates is used at once, variation in the curve from an a priori shape
is permitted and a best set of parameters can be chosen.

The parametric forms

$$x^2 + axy + by^2 + cx + dy + e = 0$$

or

$$ax^2 + bxy + y^2 + cx + dy + e = 0$$

can be used to specify any of the forms (circle, ellipse, parabola, or hyperbola)
[Rees 1963]. Estimating five parameters, however, will be expensive and frail
even with the Hough technique. Small segments of all the above mentioned curves
might be approximated by a circular arc, reducing the number of parameters to only 3.
In their work on the recognition of manufactured parts Tsuji and Matsumoto [1978]
were successful in detecting elliptical curve segments by using a hierarchical
approach combining Hough detection with curve fitting. In a first stage 2 center
parameters are detected using the Hough transform. Given a chosen center, a second
Hough detection stage determines a third parameter. All points consistent with the
3 parameters chosen in the first two stages are then fit with a full 5 parameter
elliptical model to determine the final detection decision. Finally the actual
extent of the curve must be extracted from the set of all points on the full
elliptical template.
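
Which of the four curve types a given parameter set describes can be read from the
second-degree terms alone. The sketch below uses the standard discriminant test for
a general conic $Ax^2 + Bxy + Cy^2 + \dots = 0$; for the first parametric form above
this means $A = 1$, $B = a$, $C = b$. The function name is hypothetical and
degenerate conics (line pairs, single points, empty sets) are not checked.

```python
def classify_conic(A: float, B: float, C: float, eps: float = 1e-12) -> str:
    """Classify A*x^2 + B*x*y + C*y^2 + ... = 0 by the sign of B^2 - 4AC.

    Degenerate cases are not detected; eps absorbs rounding error.
    """
    disc = B * B - 4.0 * A * C
    if disc < -eps:
        is_circle = abs(B) <= eps and abs(A - C) <= eps
        return "circle" if is_circle else "ellipse"
    if disc > eps:
        return "hyperbola"
    return "parabola"
```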


5.2.1  Hough  circle  detection 


A circle $(x - x_c)^2 + (y - y_c)^2 = r^2$ is specified by the three parameters
$(x_c, y_c, r)$. Kimme et al [1975] have presented a practical method for Hough
circle detection which includes use of gradient information for sharpening the
transform and ultimate extraction of points on the detected curve. A Hough circle
detection algorithm is presented below. The effective parameter space is highly
constrained by the verification hypothesis. High gradient points are then selected
only from a restricted region. Each possible circle is represented by one
accumulator or bin which essentially defines a template or mask laid on the image.
Figure 42 shows 3 such templates. For each selected high gradient point it is
recorded which hypothetical circles are supported by the point evidence. If the
point evidence is consistent with the hypothetical circle the appropriate bin is
incremented. For example, the point evidence $(x, y, d)$ in Figure 42 supports the
existence of circle $(x_1, y_1, r_1)$ but not circles $(x_2, y_2, r_2)$ or
$(x_3, y_3, r_3)$.


Begin Hough circle detection

  . Initialization

      Select a set S of high gradient points from a certain area
      of the image. S = {(x_i, y_i, d_i)}, i = 1,...,N where d_i is the
      direction of the gradient at (x_i, y_i).

      Establish an empty set of accumulator arrays for the
      discretized (x,y,r)-parameter space where r is the
      radius of a circle with center at (x,y). (The variation
      of parameters x, y and r is highly constrained
      by the hypothesis being verified.)

  . Distribution of local evidence

      For each point (x_i, y_i, d_i) in S do block A

          For each possible center (x_c, y_c) compatible with
          (x_i, y_i) and direction d_i do block B

              Compute r = ((x_i - x_c)^2 + (y_i - y_c)^2)^(1/2)

              Distribute mass to (x_c, y_c, r) in set
              of accumulators.

          end block B

      end block A

  . Detection of global structure

      Smooth the accumulator array. Detect above threshold
      accumulator array peaks.

End Hough circle detection
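
The algorithm above can be sketched in runnable form as follows. This is an
illustration under stated assumptions rather than the implementation used in the
study: the candidate centers are supplied as an explicit list, "compatible" is
taken to mean that the gradient line through the point passes within `angle_tol`
radians of the candidate center, radii are quantized by rounding to integers, each
vote has unit mass, and the smoothing step is omitted. All names are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

Evidence = Tuple[float, float, float]   # (x, y, gradient direction in radians)
Cell = Tuple[int, int, int]             # accumulator cell (xc, yc, r)


def hough_circles(
    points: Iterable[Evidence],
    centers: Sequence[Tuple[int, int]],
    r_range: Tuple[float, float],
    angle_tol: float,
    threshold: int,
) -> List[Tuple[Cell, int]]:
    """Vote in a discretized (xc, yc, r) space; return cells with enough votes."""
    acc: Counter = Counter()
    r_min, r_max = r_range
    for x, y, d in points:                       # block A
        for xc, yc in centers:                   # block B
            dx, dy = xc - x, yc - y
            r = math.hypot(dx, dy)
            if not (r_min <= r <= r_max):
                continue
            # Angle between the gradient line and the direction to the center,
            # folded into [0, pi/2] so that either gradient sense is accepted.
            diff = abs(math.atan2(dy, dx) - d) % math.pi
            diff = min(diff, math.pi - diff)
            if diff <= angle_tol:
                acc[(xc, yc, round(r))] += 1     # distribute unit mass
    # Smoothing of the accumulator is omitted in this sketch.
    return [(cell, votes) for cell, votes in acc.most_common() if votes >= threshold]


def demo() -> List[Tuple[Cell, int]]:
    """Twelve synthetic edge points on a circle of radius 5 centered at (10, 10)."""
    pts = []
    for k in range(12):
        t = math.radians(30 * k)
        pts.append((10 + 5 * math.cos(t), 10 + 5 * math.sin(t), t))
    grid = [(i, j) for i in range(8, 13) for j in range(8, 13)]
    return hough_circles(pts, grid, (3.0, 7.0), 0.05, 10)
```

Restricting `centers` and `r_range` plays the role of the verification hypothesis:
only a small block of the parameter space is ever allocated or searched.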




etc.) It will be necessary to restrict the template used to some subset of its
original coverage. This is perhaps best done, not by changing the parameterization,
but by masking the region from which points are selected. For an example consider
Figure 43. Verification of the existence of a circular arc in the shaded region is
desired. The shaded region is determined by the circular (actually ring-shaped)
template intersected with two halfplanes $h_1$ and $h_2$. Points selected as
evidence can easily be confined to any convex polygon of n sides by application of
n linear inequalities (halfplane tests). Any accumulated evidence for global
structure is then guaranteed to be dense in a restricted region rather than being
dispersed in a complete template.
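
The masking just described amounts to a conjunction of simple tests on each
candidate point. A minimal sketch follows; the representation of a halfplane as a
triple $(a, b, c)$ meaning $ax + by \le c$, the ring half-width parameter, and the
function names are assumptions made for illustration.

```python
import math
from typing import Iterable, List, Sequence, Tuple

HalfPlane = Tuple[float, float, float]   # (a, b, c) meaning a*x + b*y <= c


def inside_all(x: float, y: float, halfplanes: Sequence[HalfPlane]) -> bool:
    """True when (x, y) satisfies every halfplane test (convex region)."""
    return all(a * x + b * y <= c for a, b, c in halfplanes)


def in_ring(x: float, y: float, xc: float, yc: float, r: float, width: float) -> bool:
    """True when (x, y) lies within `width` of the circle of radius r."""
    return abs(math.hypot(x - xc, y - yc) - r) <= width


def select_evidence(
    points: Iterable[Tuple[float, float]],
    xc: float, yc: float, r: float, width: float,
    halfplanes: Sequence[HalfPlane],
) -> List[Tuple[float, float]]:
    """Keep only points in the ring-shaped template cut by the halfplanes."""
    return [
        (x, y) for x, y in points
        if in_ring(x, y, xc, yc, r, width) and inside_all(x, y, halfplanes)
    ]
```

With the two halfplanes $x \ge 0$ and $y \ge 0$, written as `(-1, 0, 0)` and
`(0, -1, 0)`, only the first-quadrant arc of the ring contributes evidence.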


5.3  Verification  of  particular  boundary  curves 

When searching the data for a particularly shaped boundary or linear feature
neither curve-fitting nor Hough detection is appropriate. Not only might there
be too many parameters for robust detection but also there is no need to bear the
greater expense relative to the practical alternative given below. Here we are
talking about verification of features whose shape is precisely known such as the
path of a stream or the wing of a certain airplane.

5.3.1 A practical verification technique for particular curves

As a first example consider the verification of the existence of Sherman
Creek in imagery of Harrisburg, Penna. (refer to Figure 17). It is easy to
store the path of the stream as an ordered set of points in some coordinate
system as is done in cartographic data bases. Given a registration of imagery
containing the creek to the cartographic coordinate system, it is easy to
transform the points of the mapped feature to the points (pixels) of the image
where the feature should be found. Due to noise, distortion, approximation in the
registration transformation, and actual change in the stream, it is unlikely that
the feature points will be found exactly where they are predicted to be. Thus
some tolerance must be used in the verification and the verification result should
be an evaluation of how far the detected feature points deviated from their
predicted position. A root-mean-square value is one measure of the match. Let
$P = \{p_1, p_2, \dots, p_N\}$ be the feature point set in map coordinates and let
$T_\alpha$ be the transformation registering the imagery to the map coordinate
system. Let $q_m$ be the best detection of point $p_m$ in the neighborhood of
$T_\alpha^{-1}(p_m)$ in the image. Then one measure of the verification of point
set $P$ is

$$D(P) = \left( \frac{1}{N} \sum_{m=1}^{N} d^2\!\left(T_\alpha^{-1}(p_m),\, q_m\right) \right)^{1/2}$$

where $d^2$ is the squared distance between the two image points. Not only did
Barrow et al [1977] use such a verification measure, but they also did hill-climbing
on the parameter set $\alpha$ in order to get a better registration $T_\alpha^{-1}$.
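
The root-mean-square measure is straightforward to compute once the predicted
image positions and the detected points are paired. In the sketch below the
function name is hypothetical, and `predicted` is assumed to hold the map points
already transformed into image coordinates, in the same order as `detected`.

```python
import math
from typing import Sequence, Tuple

Pixel = Tuple[float, float]


def rms_verification(predicted: Sequence[Pixel], detected: Sequence[Pixel]) -> float:
    """Root-mean-square distance between paired predicted and detected points."""
    if not predicted or len(predicted) != len(detected):
        raise ValueError("need two non-empty point lists of equal length")
    total = sum(
        (px - qx) ** 2 + (py - qy) ** 2
        for (px, py), (qx, qy) in zip(predicted, detected)
    )
    return math.sqrt(total / len(predicted))
```

A hill-climbing refinement of the registration would re-evaluate this function for
perturbed parameter sets and keep the perturbation that lowers the value.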

As a second example, consider the verification of the nose of an airplane
after registration with a model according to the procedures of Section 4. Points
to be tested on the nose are appended to the model as a list
$P = \{p_1, \dots, p_N\}$ defining the boundary curve. Once $T_\alpha$ is determined
as in Section 4, $T_\alpha$ can be used to define where the curve should be in the
image. The image is searched along a normal to the curve at point
$T_\alpha^{-1}(p_m)$ for the point $q_m$ of highest gradient magnitude and
compatible gradient direction. The match value $e(P)$ can then be evaluated.
Figure 44 shows the nose of the plane in image AFB1. Sixteen mapped points
$T_\alpha^{-1}(p_m)$ (circled) defining a nose are transformed into the image space
and the gradients are examined along profiles normal to the predicted boundary
curve. Points of the detected boundary curve are selected on these profiles. In the
case shown in Figure 44, $D(P) = 0.3$ pixels. The RMS distance between predicted and
detected positions can be compared to a tolerance and converted to a match value in
the range [0,1].

$$e(P) = \max\left\{0,\; 1 - \frac{D(P)}{t}\right\}$$

where $t$ is the tolerance.

The method of searching along normals to the predicted curve to detect feature
points has been used by Perkins [1977] in a parts inspection application.
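
The search along a normal and the conversion of a distance to a match value can be
sketched as follows. Several assumptions are made: `gradient` is a hypothetical
callable returning the gradient magnitude at integer pixel coordinates, the test
for compatible gradient direction is omitted, the profile is sampled at unit
steps, and the match value is taken to fall off linearly with distance relative to
the tolerance.

```python
import math
from typing import Callable, Optional, Tuple


def search_along_normal(
    gradient: Callable[[int, int], float],
    point: Tuple[float, float],
    normal: Tuple[float, float],
    half_length: int,
    min_magnitude: float,
) -> Optional[Tuple[int, int]]:
    """Return the pixel of strongest gradient on the normal profile through
    `point`, or None when no sample exceeds min_magnitude."""
    px, py = point
    norm = math.hypot(normal[0], normal[1])
    nx, ny = normal[0] / norm, normal[1] / norm
    best, best_mag = None, min_magnitude
    for step in range(-half_length, half_length + 1):
        x, y = round(px + step * nx), round(py + step * ny)
        mag = gradient(x, y)
        if mag > best_mag:
            best, best_mag = (x, y), mag
    return best


def match_value(rms_distance: float, tolerance: float) -> float:
    """Map an RMS distance to [0, 1]; the linear falloff is an assumption."""
    return max(0.0, 1.0 - rms_distance / tolerance)
```

Unlike the one-point-per-profile sampling of Section 5.1.5, the profiles here are
short and centered on a precisely predicted position, so the strongest local
response is much more likely to lie on the true boundary.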

5.3.2 Continuation of the airplane detection experiment

An experiment in airplane detection using straight edge evidence was described
in Section 4. Results of that experiment were reported in Column 3 of Table 3 and
showed that not all model edges were detected. Several ambiguities existed due to
considerable rotational symmetry at 90°, 180°, and 270° off of the true airplane
position. In particular for plane Number 1 there were strong responses at both
θ=145° and θ=329°. In a second phase of the plane detection experiment model
edges that were not detected in the original bottom-up feature detection were
verified top-down under the possible registration transformations. For the case
in point in Table 4.3 the transformation (θ=329°, xs=-108, ys=46) explained only 8
of the 10 model edges with image edge evidence while the transformation (θ=145°,
xs=133, ys=-36) accounted for only 6 of 10 model edges. It was expected that
verification would strongly enhance the object detection results and eliminate
ambiguities. In particular verification of the tail of the plane, whose edges were
often missed by primary detection, would eliminate ambiguities caused by rotational
symmetry.

Figure 44. Verification of the existence of airplane nose in AFB1 image

The experimental procedure was as follows. The results of registration (in
Column 3 of Table 3) were written out to a file for input to the verification.
Along with each set of possible registration parameters $\alpha$ was recorded
exactly which model edges were not explained by the primary detections and the
transformation $T_\alpha$. This output was then used in the verification phase to
perform focused searches in the imagery for previously undetected edges. The
previously derived match values for each $T_\alpha$ were then adjusted according to
the verification evidence.

5.3.3  Results  of  verification  of  airplane  detections 

Table  6 contains  a summary  of  the  results  of  verification  of  the  initial 
airplane  detections  listed  in  Table  3.  In  every  case  the  verification  process 
was  able  to  significantly  increase  the  confidence  in  the  detection  decision.  In 
subwindows  1 and  7,  where  no  planes  were  present,  no  new  positive  evidence  was 
gathered. In fact failure to find any additional edge evidence resulted in large
drops in match weight, to 3.4 and 11.1 respectively on a scale of 100. For the
five subwindows which contained planes the match confidence went up for the correct
detection and went down for the incorrect alternative detection. In each
case all 10 model edges were found for the correct detection while exactly 7 edges
were found for the alternatives. The 3 edges of the tail were not found in any of
the alternative detections. It should be noted how important the tail is in
countering the rotational symmetry of the plane and in differentiating an airplane
from an intersection of roads. Additional verification of points on the nose,
engines, wing and tail tips would give further confidence to the detection decision.
Match weights for correct matches of primary detections were in the range of 40 to
60 out of a possible 100, primarily because the Hough detected edge elements either
undershot or overshot the true endpoints of the edge. Match weights for verified
edges were roughly in the same range due to slight misalignment of predicted
positions and detected positions on the edge. Hough detections could be cleaned up
to remove the first "problem" and the alignment could be adjusted to remove the
second. Although the code for performing such adjustments was actually at hand, it
was decided not to complicate the experiments by using it. The results obtained so
far are conclusive enough to demonstrate the viability of both the registration and
verification.

6. Conclusions


This report studied the use of models in image analysis. Models are
essentially structured a priori information which can be used to interpret
data in a manner consistent with global real world knowledge. Potential models
are evoked by features extracted by primitive bottom-up processing. Features
studied in this research were all derivable from boundary curve segments and
included straight edge segments, circular or parabolic edge segments, and points
of high curvature on boundary segments. Two types of models were considered.
One type of model was the formal Problem Reduction Representation (PRR) or
Context Free Grammar (CFG). The second type of model was the iconic model, which
corresponded to a cartographic data base, in which the form of particularly shaped
geographic features was encoded. Regardless of the type of model being used,
primitive features are used to align a hypothetical model with raw image data.
All further image analysis is then made by verification (or denial) of hypotheses
generated (top-down) from the model. Verification of shape features in the imagery
was viewed as either curve-fitting under constraints or template (icon) matching.

The study attempted to draw conclusions about an entire system by studying
its several possible parts. Many experiments were performed and a great deal of
literature was reviewed. As a result of this study the following conclusions have
been reached.

Totally bottom-up synthesis of an image interpretation is at best
inefficient  and  at  worst  logically  impossible.  To  form  an  inter- 
pretation that  is  to  be  useful  is  to  integrate  an  instance  of  data 
with  a large  amount  of  stored  knowledge.  During  this  process  it  is 
possible  that  some  of  the  physical  evidence  will  be  ignored  or  even 
contradicted. 

Representing real world knowledge, particularly for use by a computer,
is a difficult task with much current research activity. Use of general
shape models such as PRR or CFG has shown some promise but often
turns out to be difficult in practice. In the current study the use of
particular icons, i.e. maps, appeared to be much more practical.

Current  automatic  low  level  feature  detection  techniques  can  support 
complex  analysis  when  features  are  registered  to  an  iconic  (geographic/ 
cartographic)  data  base.  The  registration  process  must  consider  the 
feature  set  as  errorful  and  partial,  yet  be  able  to  arrive  at  correct 
decisions.  A clustering  method  for  arriving  at  correct  global  inter- 
pretations from  errorful  and  partial  local  evidence  was  developed  dur- 
ing the  period  of  study. 

The clustering technique for making global decisions from imperfect
evidence has advantages over sequential syntactic procedures and
iterative relaxation procedures because multiple competing global
hypotheses can be simultaneously weighed. Incorrect or ambiguous
local structure cannot propagate because all evidence is considered
in parallel (i.e. order independent).

While the several heuristics used in this study for evaluation of
competing interpretations seemed to work in practice, the determination
of confidence in a hypothesis from combined evidence remains
a sticky problem in theory and in practice. In this study only shape
features were used, but in a practical system shape information will
be used with, say, color and non-imagery derived intelligence data.
Much more work and thought is needed to study the problem of making
decisions on combined evidence.

The  most  promising  future  direction  for  reconnaissance  image  analysis 
appears  to  be  along  the  line  of  map-guided  image  analysis.  A huge  re- 
source of  iconic  real  world  knowledge  already  exists  in  cartographic 
data  bases.  This  study  showed  how  that  resource  might  be  used  in  auto- 
matic or  semiautomatic  image  analysis.  Further  work  in  this  area  is 
likely  to  advance  the  encoding  of  geographic  data  bases  to  enhance  the 
automatic  image  analysis  function. 